Continuous Integration and Continuous Deployment

Table of Contents
1. Introduction
2. Real-World Case Study: Converting a Poorly Maintained WordPress Site to Hugo
2.1. Setting Up Hugo
2.2. Converting WordPress to Hugo Posts
2.3. Creating an Algolia Index and Updating It
2.4. Orchestrating with a Makefile
2.5. Deploying with AWS CodePipeline
3. Real-World Case Study: Deploying a Python App Engine Application with Google Cloud Build
4. Real-World Case Study: NFSOPS
1. Introduction

A CI system:
* Clones the codebase for the software under consideration from a source control system such as GitHub,
* Builds the software into an artifact that can be a binary, a tar archive, or a Docker image, and,
* Very importantly, also runs unit and/or integration tests for the software.

A CD system:
* Deploys the artifacts built by the CI system to a target environment. This deployment can be automated for non-production environments, but usually includes a manual approval step for production.

Note: A more advanced type of such a system is a continuous delivery platform, which automates the deployment step to production and is capable of rolling back the deployment based on metrics obtained from monitoring and logging platforms.

2. Real-World Case Study: Converting a Poorly Maintained WordPress Site to Hugo

An inventory website is served via a WordPress site that is frequently hacked and performs horribly. The steps to convert it to a solid, high-performing site are:
* Backup
* Convert
* Upgrade
* Deploy

Some of the requirements to consider:
* It needed to be continuously deployed.
* It needed to be fast to run and develop against!
* It should be a static site hosted from a cloud provider.
* There should be a reasonable workflow for converting from WordPress.
* It should be possible to create a reasonable search interface using Python.

In the end I decided to use Hugo, AWS, and Algolia. You can find the code repo here. The general architecture looked like Figure 1.
Figure 1: Continuous deployment with Hugo
2.1. Setting Up Hugo

Getting started with Hugo is very straightforward (see the getting started Hugo guide). First, install the software:
brew install hugo
The only thing left to do is to initialize a skeleton Hugo app and install a theme:
hugo new site quickstart
This creates a new site called quickstart. You can build this site again, very quickly, by running hugo. This compiles the markdown files to HTML and CSS.

2.2. Converting WordPress to Hugo Posts

Next, we converted the WordPress database to JSON via a raw dump. Then we wrote a Python script to convert this data into Hugo posts in markdown format. Here is that code:
"""Conversion code of old database fields into markdown example.

If you did a database dump of WordPress and then converted it to JSON, you could
tweak this."""

import os
import shutil
from category import CAT
from new_picture_products import PICTURES

def check_all_category():
    ares = {}
    REC = []
    for pic in PICTURES:
        res = check_category(pic)
        if not res:
            pic["categories"] = "Other"
            REC.append(pic)
            continue

        title,key = res
        if key:
            print("FOUND MATCH: TITLE--[%s], CATEGORY--[%s]" %\
                (title, key))
            ares[title]= key
            pic["categories"] = key
            REC.append(pic)
    return ares, REC

def check_category(rec):

    title = str(rec['title'])
    for key, values in CAT.items():
        print("KEY: %s, VALUE: %s" % (key, values))
        if title in key:
            return title,key
        for val in values:
            if title in val:
                return title,key

def move_image(val):
    """Creates a new copy of the uploaded images to img dir"""

    source_picture = "static/uploads/%s" % val["picture"]
    destination_dir = "static/img/"
    shutil.copy(source_picture, destination_dir)

def new_image_metadata(vals):
    new_paths = []
    for val in vals:
        pic = val['picture'].split("/")[-1:].pop()
        destination_dir = "static/img/%s" % pic
        val['picture'] = destination_dir
        new_paths.append(val)
    return new_paths

CAT_LOOKUP = {'2100': 'Foo',
              'a': 'Biz',
              'b': 'Bam',
              'c': 'Bar',
              '1': 'Foobar',
              '2': 'bizbar',
              '3': 'bam'}

def write_post(val):

    tags = val["tags"]
    date = val["date"]
    title = val["title"]
    picture = val["picture"]
    categories = val["categories"]
    out = """
+++
tags = ["%s"]
categories = ["%s"]
date = "%s"
title = "%s"
banner = "%s"
+++
[![%s](%s)](%s)
 **Product Name**: %s""" %\
    (tags, categories, date, title, picture.lstrip("/"),
     title, picture, picture, title)

    filename = "../content/blog/%s.md" % title
    if os.path.exists(filename):
        print("Removing: %s" % filename)
        os.unlink(filename)

    with open(filename, 'a') as the_file:
        the_file.write(out)

if __name__ == '__main__':
    from new_pic_category import PRODUCT
    for product in PRODUCT:
        write_post(product)
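To make the output of the conversion concrete, here is a minimal sketch of the front-matter step applied to a single record. The record values and the `toml_string` and `render_post` helpers are hypothetical; only the TOML keys mirror the ones used by `write_post`. The sketch is simplified (it leaves out the image link) and, unlike the script, it escapes double quotes in values, which is a swapped-in detail rather than part of the original code.

```python
def toml_string(value):
    """Quote a value as a TOML basic string, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


def render_post(record):
    """Render one product record as a Hugo post with TOML front matter."""
    lines = [
        "+++",
        "tags = [%s]" % toml_string(record["tags"]),
        "categories = [%s]" % toml_string(record["categories"]),
        "date = %s" % toml_string(record["date"]),
        "title = %s" % toml_string(record["title"]),
        "banner = %s" % toml_string(record["picture"].lstrip("/")),
        "+++",
        "**Product Name**: %s" % record["title"],
    ]
    return "\n".join(lines) + "\n"


# Hypothetical record, shaped like the dictionaries the script iterates over.
example = {
    "tags": "inventory",
    "categories": "Other",
    "date": "2018-06-01",
    "title": 'Widget 3" model',
    "picture": "/static/img/widget.jpg",
}
post = render_post(example)
print(post)
```

Titles that contain a double quote are a plausible failure case for a plain `%s` template, since the quote would end the TOML string early.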
2.3. Creating an Algolia Index and Updating It

The next step is to write some Python code that creates an Algolia index and syncs it. Algolia is a great tool to use because it quickly solves the search engine problem and has nice Python support as well. This script crawls through all of the markdown files and generates a search index that can be uploaded to Algolia:
r"""
Creates a very simple JSON index for Hugo to import into Algolia. Easy to extend.

#might be useful to run this on content directory to remove spaces
for f in *\ *; do mv "$f" "${f// /_}"; done

"""
import os
import json

CONTENT_ROOT = "../content/products"
CONFIG = "../config.toml"
INDEX_PATH = "../index.json"

def get_base_url():
    """Returns the baseurl value from the Hugo config (None if it is missing)"""
    with open(CONFIG) as config:
        for line in config:
            if line.startswith("baseurl"):
                url = line.split("=")[-1].strip().strip('"')
                return url

def build_url(base_url, title):

    url = "<a href='%sproducts/%s'>%s</a>" %\
        (base_url.strip(), title.lower(), title)
    return url

def clean_title(title):
    title_one = title.replace("_", " ")
    title_two = title_one.replace("-", " ")
    title_three = title_two.capitalize()
    return title_three

def build_index():
    baseurl = get_base_url()
    index =[]
    posts = os.listdir(CONTENT_ROOT)
    for line in posts:
        print("FILE NAME: %s" % line)
        record = {}
        # splitext drops only the extension; strip(".md") would also eat
        # leading or trailing ".", "m", and "d" characters from the name
        title = os.path.splitext(line)[0]
        record['url'] = build_url(baseurl, title)
        record['title'] = clean_title(title)
        print("INDEX RECORD: %s" % record)
        index.append(record)
    return index

def write_index():
    index = build_index()
    with open(INDEX_PATH, 'w') as outfile:
        json.dump(index,outfile)

if __name__ == '__main__':
    write_index()
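One detail worth checking whenever a title is derived from a file name: `str.strip` treats its argument as a set of characters to remove from both ends, not as a suffix. The short sketch below compares it with `os.path.splitext`; the file names and the `title_from_filename` helper are hypothetical.

```python
import os


def title_from_filename(filename):
    """Drop the extension only, leaving the rest of the name untouched."""
    return os.path.splitext(filename)[0]


# Hypothetical file names, not taken from the site.
for name in ["model.md", "dampers.md", "widget.md"]:
    stripped = name.strip(".md")
    print(name, "->", repr(stripped), "vs", repr(title_from_filename(name)))
```

Names that begin or end with `m`, `d`, or `.` lose characters with `strip(".md")` (for example, `model.md` becomes `odel`), while `splitext` removes only the extension.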
Finally, the index can be sent to Algolia with this snippet:
import json
from algoliasearch import algoliasearch

def update_index():
    """Deletes index, then updates it"""
    print("Starting Updating Index")
    client = algoliasearch.Client("YOUR_KEY", "YOUR_VALUE")
    index = client.init_index("your_INDEX")
    print("Clearing index")
    index.clear_index()
    print("Loading index")
    batch = json.load(open('../index.json'))
    index.add_objects(batch)

if __name__ == '__main__':
    update_index()
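Algolia identifies each record by its `objectID` attribute. The upload script clears the index before loading, so duplicates are not a concern there, but stable IDs could become useful if you ever moved to incremental updates. The following is a sketch under that assumption: `with_object_ids` is a hypothetical helper, and deriving the ID from a SHA-1 hash of the URL is just one possible choice.

```python
import hashlib
import json


def with_object_ids(records):
    """Return copies of the records with a deterministic objectID added."""
    result = []
    for record in records:
        digest = hashlib.sha1(record["url"].encode("utf-8")).hexdigest()
        enriched = dict(record)  # copy, so the input list is left unchanged
        enriched["objectID"] = digest
        result.append(enriched)
    return result


# Hypothetical records in the same shape that build_index() produces.
records = [
    {"url": "<a href='https://example.com/products/foo'>foo</a>", "title": "Foo"},
    {"url": "<a href='https://example.com/products/bar'>bar</a>", "title": "Bar"},
]
batch = with_object_ids(records)
print(json.dumps(batch, indent=2))
```

Because the ID depends only on the URL, running the helper twice over the same content yields the same IDs.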
2.4. Orchestrating with a Makefile

Using a Makefile allows you to replicate the steps your deployment process will use later. We typically set up a Makefile to orchestrate this locally. Here is what the entire build and deploy process looks like:
build:
	rm -rf public
	hugo

watch: clean
	hugo server -w

create-index:
	cd algolia;python make_algolia_index.py;cd ..

update-index:
	cd algolia;python sync_algolia_index.py;cd ..

make-index: create-index update-index

clean:
	-rm -rf public

sync:
	aws s3 --profile <yourawsprofile> sync --acl \
		"public-read" public/ s3://example.com

build-deploy-local: build sync

all: build-deploy-local
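To see which commands `make all` ends up running, it can help to model the targets as data. The sketch below is a simplified imitation of how make walks prerequisites (depth first, each target once); it is not how make itself is implemented, and the `TARGETS` table and `plan` function are hypothetical. Only the targets reachable from `all` are included.

```python
# Each target maps to (prerequisites, commands), mirroring part of the Makefile.
TARGETS = {
    "build": ([], ["rm -rf public", "hugo"]),
    "sync": (
        [],
        ['aws s3 --profile <yourawsprofile> sync --acl "public-read" public/ s3://example.com'],
    ),
    "build-deploy-local": (["build", "sync"], []),
    "all": (["build-deploy-local"], []),
}


def plan(target, done=None):
    """Return the commands for a target, prerequisites first, each target once."""
    if done is None:
        done = set()
    if target in done:
        return []
    done.add(target)
    prerequisites, commands = TARGETS[target]
    ordered = []
    for prerequisite in prerequisites:
        ordered.extend(plan(prerequisite, done))
    ordered.extend(commands)
    return ordered


for command in plan("all"):
    print(command)
```

The printed order (clean out `public`, run `hugo`, then sync to S3) is the same sequence the hosted build repeats later.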
2.5. Deploying with AWS CodePipeline

Amazon Web Services (AWS) is a common deployment target for hosting a static website via Amazon S3, Amazon Route 53, and Amazon CloudFront. AWS CodePipeline, their build server service, works very well as the deployment mechanism for these sites. You can log into AWS CodePipeline, set up a new build project, and tell it to use a buildspec.yml file. The code can be customized and the portions that are templated out can be replaced with actual values.

As soon as GitHub gets a change event, CodePipeline runs the install in a container.
* First it grabs the specific version of Hugo specified.
* Next it builds the Hugo pages. Thousands of Hugo pages can be rendered subsecond because of the speed of Go.
* Finally, the HTML pages are synced to Amazon S3. Because this is running inside of AWS and is synced, it is also extremely fast.

The final step is to invalidate the CloudFront cache:
version: 0.1

environment_variables:
  plaintext:
    HUGO_VERSION: "0.42"

phases:
  install:
    commands:
      - cd /tmp
      - wget https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_${HUGO_VERSION}_Linux-64bit.tar.gz
      - tar -xzf hugo_${HUGO_VERSION}_Linux-64bit.tar.gz
      - mv hugo /usr/bin/hugo
      - cd -
      - rm -rf /tmp/*
  build:
    commands:
      - rm -rf public
      - hugo
  post_build:
    commands:
      - aws s3 sync public/ s3://<yourwebsite>.com/ --region us-west-2 --delete
      - aws s3 cp s3://<yourwebsite>.com/ s3://<yourwebsite>.com/ --metadata-directive REPLACE --cache-control 'max-age=604800' --recursive
      - aws cloudfront create-invalidation --distribution-id=<YOURID> --paths '/*'
      - echo Build completed on `date`
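The install phase builds the download URL from `HUGO_VERSION`. As a quick illustration of that substitution, here is a small sketch; `hugo_release_url` is a hypothetical helper, and the URL pattern is copied from the buildspec rather than from any Hugo API.

```python
def hugo_release_url(version):
    """Build the Linux 64-bit tarball URL used in the install phase."""
    return (
        "https://github.com/gohugoio/hugo/releases/download/"
        "v{v}/hugo_{v}_Linux-64bit.tar.gz".format(v=version)
    )


print(hugo_release_url("0.42"))
```

Keeping the version in a single variable means that upgrading Hugo in the pipeline is a one-line change.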
3. Real-World Case Study: Deploying a Python App Engine Application with Google Cloud Build

The Google Cloud Platform (GCP) Cloud Build works a lot like AWS CodePipeline. Here is a config file that is checked into a GitHub repo. The config file is named cloudbuild.yaml. You can see all of the source code for this project in this Git repository.
steps:
- name: python:3.7
  id: INSTALL
  entrypoint: python3
  args:
  - '-m'
  - 'pip'
  - 'install'
  - '-t'
  - '.'
  - '-r'
  - 'requirements.txt'
- name: python:3.7
  entrypoint: ./pylint_runner
  id: LINT
  waitFor:
  - INSTALL
- name: "gcr.io/cloud-builders/gcloud"
  args: ["app", "deploy"]
timeout: "1600s"
images: ['gcr.io/$PROJECT_ID/pylint']
Note that the cloudbuild.yaml file runs gcloud app deploy, which deploys the App Engine application on check-in to GitHub. It also installs the packages listed in the requirements.txt file, shown here:
Flask==1.0.2
gunicorn==19.9.0
pylint==2.3.1
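The `waitFor` field in cloudbuild.yaml controls step ordering. The sketch below models that ordering under one stated assumption about Cloud Build's behavior: a step with no `waitFor` waits for every step listed before it. The `prerequisites` function and the `step-N` labels for steps without an `id` are hypothetical and only serve the illustration.

```python
def prerequisites(steps):
    """Map each step label to the labels it waits for before starting."""
    plan = {}
    earlier = []
    for position, step in enumerate(steps):
        label = step.get("id", "step-%d" % position)
        # Assumption: a step without waitFor waits for every earlier step.
        plan[label] = list(step.get("waitFor", earlier))
        earlier.append(label)
    return plan


# The three steps from cloudbuild.yaml, reduced to the fields that matter here.
steps = [
    {"id": "INSTALL"},
    {"id": "LINT", "waitFor": ["INSTALL"]},
    {"name": "gcr.io/cloud-builders/gcloud"},
]
print(prerequisites(steps))
```

Under that assumption, the deploy step only runs once both the install and lint steps have finished, so a lint failure blocks the deployment.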
Here is a walk-through of how to set up this entire project:

1. Create the project.
2. Activate the cloud shell.
3. Refer to the hello world docs for the Python 3 App Engine.
4. Run describe to verify that the project is working:

```bash
gcloud projects describe $GOOGLE_CLOUD_PROJECT
```

Output of the command:

```bash
createTime: '2019-05-29T21:21:10.187Z'
lifecycleState: ACTIVE
name: helloml
projectId: helloml-xxxxx
projectNumber: '881692383648'
```
5. You may want to verify that you have the correct project. If not, do this to switch → gcloud config set project $GOOGLE_CLOUD_PROJECT
6. Create the App Engine app → gcloud app create
* This will ask for the region. Go ahead and pick us-central [12].
Creating App Engine application in project [helloml-xxx]
and region [us-central]....done.
Success! The app is now created.
Please use `gcloud app deploy` to deploy your first app.
7. Clone the hello world sample app repo and cd into the repo:
git clone https://github.com/GoogleCloudPlatform/python-docs-samples
8. Update the Cloudshell image (note that this is optional):
git clone https://github.com/noahgift/gcp-hello-ml.git
# Update .cloudshellcustomimagerepo.json with project and image name
# TIP: enable "Boost Mode" in Cloudshell
cloudshell env build-local
cloudshell env push
cloudshell env update-default-image
# Restart Cloudshell VM
9. Create and source the virtual environment:
virtualenv --python $(which python) venv
source venv/bin/activate
10. Activate the cloud shell editor and install the required packages → pip install -r requirements.txt
* This should install Flask → Flask==1.0.2
11. Run Flask locally in the GCP shell → python main.py
12. Use the web preview:
Figure 2: Web preview
13. Update main.py:
from flask import Flask
from flask import jsonify

app = Flask(__name__)

@app.route('/')
def hello():
    """Return a friendly HTTP greeting."""
    return 'Hello I like to make AI Apps'

@app.route('/name/<value>')
def name(value):
    val = {"value": value}
    return jsonify(val)

if __name__ == '__main__':
    app.run(host='127.0.0.1', port=8080, debug=True)
14. Test out passing in parameters to exercise this function:
@app.route('/name/<value>')
def name(value):
    val = {"value": value}
    return jsonify(val)
* For example, calling this route will take the word lion and pass it into the name function in Flask → https://8080-dot-3104625-dot-devshell.appspot.com/name/lion
* Returns a value in the web browser → {value: "lion"}
15. Now, deploy the app → gcloud app deploy
Be warned! The first deploy could take about 10 minutes. You might also need to enable the cloud build API.
Do you want to continue (Y/n)? y
Beginning deployment of service [default]...
╔════════════════════════════════════════════════════════════╗
╠═ Uploading 934 files to Google Cloud Storage ═╣
16. Now stream the log files → gcloud app logs tail -s default
17. The production app is deployed and should look like this:
Setting traffic split for service [default]...done.
Deployed service [default] to [https://helloml-xxx.appspot.com]
You can stream logs from the command line by running:
  $ gcloud app logs tail -s default

  $ gcloud app browse
(venv) noah_gift@cloudshell:~/python-docs-samples/appengine/\
standard_python37/hello_world (helloml-242121)$ gcloud app logs tail -s default
Waiting for new log entries...
2019-05-29 22:45:02 default[2019] [2019-05-29 22:45:02 +0000] [8]
2019-05-29 22:45:02 default[2019] [2019-05-29 22:45:02 +0000] [8] (8)
2019-05-29 22:45:02 default[2019] [2019-05-29 22:45:02 +0000] [8]
2019-05-29 22:45:02 default[2019] [2019-05-29 22:45:02 +0000] [25]
2019-05-29 22:45:02 default[2019] [2019-05-29 22:45:02 +0000] [27]
2019-05-29 22:45:04 default[2019] "GET /favicon.ico HTTP/1.1" 404
2019-05-29 22:46:25 default[2019] "GET /name/usf HTTP/1.1" 200
18. Add a new route and test it out:
@app.route('/html')
def html():
    """Returns some custom HTML"""
    return """
    <title>This is a Hello World World Page</title>
    <p>Hello</p>
    <p><b>World</b></p>
    """
19. Install Pandas and return JSON results. At this point, you may want to consider creating a Makefile and doing this:
touch Makefile
#this goes inside that file
install:
	pip install -r requirements.txt
i. You also may want to set up lint:
pylint --disable=R,C main.py
------------------------------------
Your code has been rated at 10.00/10
The web route syntax looks like the following block. Add the Pandas import at the top of main.py:
import pandas as pd

@app.route('/pandas')
def pandas_sugar():
    df = pd.read_csv(
        "https://raw.githubusercontent.com/noahgift/sugar/master/data/education_sugar_cdc_2003.csv")
    return jsonify(df.to_dict())
When you call the route https://<yourapp>.appspot.com/pandas, you should get something like Figure 3.
Figure 3: Example of JSON output
20. Add this Wikipedia route:
import wikipedia

@app.route('/wikipedia/<company>')
def wikipedia_route(company):
    result = wikipedia.summary(company, sentences=10)
    return result
21. Add NLP to the app:
* Run IPython Notebook.
* Enable the Cloud Natural Language API.
* Run pip install google-cloud-language:
In [1]: from google.cloud import language
   ...: from google.cloud.language import enums
   ...:
   ...: from google.cloud.language import types

In [2]: text = "LeBron James plays for the Cleveland Cavaliers."
   ...: client = language.LanguageServiceClient()
   ...: document = types.Document(
   ...:     content=text,
   ...:     type=enums.Document.Type.PLAIN_TEXT)
   ...: entities = client.analyze_entities(document).entities

In [3]: entities
22. Here is an end-to-end AI API example:
from flask import Flask
from flask import jsonify
import pandas as pd
import wikipedia

app = Flask(__name__)

@app.route('/')
def hello():
    """Return a friendly HTTP greeting."""
    return 'Hello I like to make AI Apps'

@app.route('/name/<value>')
def name(value):
    val = {"value": value}
    return jsonify(val)

@app.route('/html')
def html():
    """Returns some custom HTML"""
    return """
    <title>This is a Hello World World Page</title>
    <p>Hello</p>
    <p><b>World</b></p>
    """

@app.route('/pandas')
def pandas_sugar():
    df = pd.read_csv(
        "https://raw.githubusercontent.com/noahgift/sugar/master/data/education_sugar_cdc_2003.csv")
    return jsonify(df.to_dict())

@app.route('/wikipedia/<company>')
def wikipedia_route(company):

    # Imports the Google Cloud client library
    from google.cloud import language
    from google.cloud.language import enums
    from google.cloud.language import types
    result = wikipedia.summary(company, sentences=10)

    client = language.LanguageServiceClient()
    document = types.Document(
        content=result,
        type=enums.Document.Type.PLAIN_TEXT)
    entities = client.analyze_entities(document).entities
    return str(entities)

if __name__ == '__main__':
    app.run(host='127.0.0.1', port=8080, debug=True)
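Once the routes from the walk-through are deployed, a small smoke test can confirm that they respond. The following is only a sketch using the standard library: `fetch_status` and `smoke_test` are hypothetical helpers, not part of the project, and the demonstration at the bottom uses a stand-in fetch function so it runs without network access.

```python
import urllib.request


def fetch_status(url):
    """Return the HTTP status code for a GET request to url."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.status


def smoke_test(base_url, paths, fetch=fetch_status):
    """Map each path to True if it answered with HTTP 200, else False."""
    results = {}
    for path in paths:
        url = base_url.rstrip("/") + path
        try:
            results[path] = fetch(url) == 200
        except OSError:
            # urllib's URLError and HTTPError are both OSError subclasses.
            results[path] = False
    return results


# Offline demonstration with a stand-in fetch function (no network needed).
def fake_fetch(url):
    return 200 if url.endswith("/name/lion") else 404


print(smoke_test("https://example.invalid", ["/name/lion", "/missing"], fetch=fake_fetch))
```

To check a real deployment, call `smoke_test` with your own appspot.com URL and leave `fetch` at its default.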
This section has shown both how to set up an App Engine application from scratch in the Google Cloud Shell and how to do continuous delivery using GCP Cloud Build.

4. Real-World Case Study: NFSOPS

NFSOPS is an operational technique that uses NFS (Network File System) mount points to manage clusters of computers. It sounds like a new thing, but it has been around since Unix has been around. As a part-time consultant at a virtual reality startup in San Francisco, one problem I faced was how to build a jobs framework, quickly, that would dispatch work to thousands of AWS Spot Instances.

The solution that ultimately worked was to use NFSOPS (Figure 4) to deploy Python code to thousands of Computer Vision Spot Instances subsecond.
Figure 4: NFSOPS
NFSOPS works by using a build server, in this case Jenkins, to mount several Amazon Elastic File System (EFS) mount points (DEV, STAGE, PROD). When a continuous integration build is performed, the final step is an rsync to the respective mount point:
#Jenkins deploy build step
rsync -az --delete * /dev-efs/code/
The “deploy” to the mount point is then subsecond. When Spot Instances are launched by the thousands, they are preconfigured to mount EFS (the NFS mount points) and use the source code. This is a handy deployment pattern that optimizes for simplicity and speed. It can also work quite well in tandem with IaC, Amazon Machine Image (AMI), or Ansible.
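The rsync deploy step can be parameterized per environment. The sketch below makes several assumptions that are not in the text: only `/dev-efs/code/` appears above, so the STAGE and PROD paths are invented for illustration, and the `rsync_command` and `deploy` helpers are hypothetical. It also passes `./` as the source instead of the shell glob `*`, which means dotfiles are included.

```python
import subprocess

# Only /dev-efs/code/ appears in the text; the other two paths are assumed.
MOUNT_POINTS = {
    "dev": "/dev-efs/code/",
    "stage": "/stage-efs/code/",  # assumption
    "prod": "/prod-efs/code/",    # assumption
}


def rsync_command(environment, source="./"):
    """Build the rsync argument list for the given environment."""
    return ["rsync", "-az", "--delete", source, MOUNT_POINTS[environment]]


def deploy(environment, dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    command = rsync_command(environment)
    print(" ".join(command))
    if not dry_run:
        subprocess.run(command, check=True)
    return command


deploy("dev")
```

Defaulting to a dry run is a deliberate choice here: `--delete` removes files on the mount point that are missing from the source, so it is worth seeing the command before running it.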