This tutorial walks through using the xlwt library to
create a multi-worksheet Excel document in Python. We’ll be using the
example of calculating our AWS server expenses, and in adition to xlwt,
we’ll also use requests to download the latest EC2 pricing data
I picked this example because I did exactly that earlier this week,
using the much less enjoyable search-copy-paste method. (Which sadly
means my pricing information is already outdated.)
Before we get started, let’s create a virtualenv to install our Python dependencies into.
(If you haven’t used virtualenv before take a gander here to get up to speed, it’s a
straightforward way to manage packages requirements for different projects.)
mkdir -p ~/code/
virtualenv --no-site-packages aws_expenses
Next, install these dependencies into our new virtualenv:
pip install requests==0.14.2 xlwt==0.7.4
Now we can quickly verify that the install succeeded:
OK, so after swimming around in that ocean, we’ve cobbled together enough information to plan out the data we want.
Let’s make three simplifying assumption:
we’re only interested in the us-east region,
we’re only interested in linux machines,
we’re only interested in USD prices.
Within those simplifying assumptions, a reasonable way for us to represent the
data we’re interested in is a list of 3-tuples, with this information:
(instanceType, size, price)
For example, a small standard instance would be represented like:
('stdODI', 'sm', '0.115')
We can accomplish that using this function:
EC2_URL = "http://aws.amazon.com/ec2/pricing/pricing-on-demand-instances.json"
def prices(region_name="us-east", os="linux", currency="USD"):
prices = requests.get(EC2_URL).json
region = next((x for x in prices['config']['regions'] \
if x['region'] == region_name))
price_tuples = 
for instance_type in region['instanceTypes']:
instance_name = instance_type['type']
for size in instance_type['sizes']:
size_name = size['size']
size_price = next((x for x in size['valueColumns'] \
if x['name'] == os))['prices'][currency]
price_tuples.append((instance_name, size_name, size_price))
Now that we can retrieve have the AWS pricing data, let’s move that data into Excel.
Writing AWS Pricing to Excel
Let’s start out by writing an Excel file with just the header row for prices:
wb = xlwt.Workbook()
ws = wb.add_sheet('EC2 Pricing')
headers = ('Type', 'Size', 'Price/hour (USD)', 'Price/month (USD)')
for i, header in enumerate(headers):
We’re using enumerate to track position within the headers–and will do the same for
the actual pricing data–because the xlwt interfaces don’t support append operations.
Now let’s look at the snippet to write our pricing data beneath the headers:
# this is the prices function we wrote above
aws_prices = prices()
# this converts (0,0) to A1, (1,0) to B1 and so on
from xlwt.Utils import rowcol_to_cell
for i, (instance, size, price) in enumerate(aws_prices, start=1):
# this generates the formula C0 * (30 * 24),
# calculating the monthly rate from the hourly rate
hourly_cell = rowcol_to_cell(i, 2)
hours_in_month = 24 * 30
ws.row(i).write(3, xlwt.Formula("%s * %s" % (hourly_cell, hours_in_month)))
Now we’re generating a full Excel spreadsheet from the AWS data, with a working formula.
The final step is creating a second sheet using this data to project expenses.
First, let’s create the headers for our new Instances worksheet:
headers = ('Type', 'Size', 'Num Instances', 'Cost/month (USD)')
for i, header in enumerate(headers):
Next, let’s look at how we’ll create the cross-worksheet reference to the
EC2 Pricing worksheet:
for i, (instance, size, _) in enumerate(aws_prices, start=1):
# instead of cost, we're writing the number of EC2 instances
# multiplying monthly price by number of instances
instances_cell = rowcol_to_cell(i, 2)
monthly_price_cell = rowcol_to_cell(i, 3)
formula = "'EC2 Pricing'!%s * %s" % (monthly_price_cell, instances_cell)
ws.row(i).write(3, xlwt.Formula(formula), style)
Now we can play around with our infrastructure costs just by changing the values
in the Num Instances column. The last piece is to add a row which sums
the total infrastructure cost across all instances.
Using Excel’s SUM function, this ends up being quite simple:
Finally, we need to call this new write_instances function as part
of creating our workbook. Note that you must create the EC2 Pricing
worksheet before populating the Instances worksheet because xlwt
verifies existance of cross-referenced worksheets.
It only validates the worksheet exists–not whether the specific cross-referenced rows exist–so you can
still create Instances before EC2 Pricing as long as
you wait to call write_instances until after both exist.
All together, exercising our functions looks like: