Green coding with Python
At PyConZA 2023 on October 5th, I had the pleasure of discussing a topic close to my heart and increasingly vital for our planet: Green coding with Python.
A Few Caveats
Before we dive in, it's important to set a few things straight:
- I am not an environmental scientist. My perspective comes from years of software engineering and a growing concern for our planet.
- This is a complex and evolving field. What works best today may not be the best tomorrow.
- The answers are often counter-intuitive and require us to look beyond surface-level assumptions.
- As the saying goes, "Perfection is the enemy of good". Small, consistent changes can make a big difference.
Why Should You Care?
The images we see of extreme weather events, like the recent flooding in parts of South Africa, are stark reminders of climate change.
This isn't just an abstract problem; it has real-world consequences. The United Nations' Sustainable Development Goals (SDGs) provide a global framework for a more sustainable future. These 17 interconnected goals address everything from poverty and hunger to climate action and clean energy.
Unfortunately, progress towards many of these goals, particularly those related to the environment, is stagnating or even regressing in some areas. For example, a recent report might show that while some SDGs are on track (green arrows), many face significant challenges (orange/red arrows or stagnating dots), especially around climate action, life below water, and life on land.
This is where green coding comes in. I define it as:
"Green coding is an environmentally sustainable computing practice, seeking to minimise the environmental impact of software."
It’s about being conscious of the resources our software consumes and making choices to reduce that consumption.
What is Running? (The Code Itself)
It's time to address the elephant in the room.
Python isn’t typically known for efficiency or raw performance.
While performance is important, it isn't the be-all and end-all, and by the end of this blog post I hope to have convinced you that we can still write efficient, green services in Python.
That said, let’s look at how we can reduce the energy usage of our Python code.
- Measure: You can't improve what you don't measure. Use profiling tools (cProfile, Scalene, py-spy, Sciagraph) to identify hotspots in your Python code, i.e. where it is spending the most time and consuming the most resources, and focus your effort there.
- Algorithms and Data Structures: The choice of algorithm can have a massive impact. An O(n²) algorithm will consume vastly more resources for large inputs than an O(n log n) or O(n) solution. Similarly, using the right data structure (e.g. a set for fast look-ups vs. a list) can significantly improve performance and reduce processing time.
- Caching: If you're computing something expensive repeatedly, cache the result. This avoids redundant computations, saving CPU cycles and energy. Caching does come with its own issues, and can be a source of bugs, so use it carefully and only when you've explored other avenues first.
- Libraries: Leverage optimised libraries. Software engineers have focused a lot on optimising libraries like numpy; leverage this work instead of trying to optimise your own custom code.
- PyPy: Consider using PyPy, an alternative Python interpreter with a Just-In-Time (JIT) compiler. For many CPU-bound Python programs, PyPy can offer significant speed-ups with no code changes.
- Resources: For more in-depth Python performance tips, I highly recommend checking out pythonspeed.com.
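As a minimal sketch of the measuring, data-structure and caching points above, the following uses only the standard library. The function names (count_hits_list, count_hits_set, slow_square, profile) and the input sizes are hypothetical and chosen purely for illustration; the timings in the report will differ from machine to machine.

```python
import cProfile
import io
import pstats
from functools import lru_cache


def count_hits_list(haystack: list[int], needles: list[int]) -> int:
    """Membership test against a list: each lookup may scan the whole list."""
    return sum(1 for n in needles if n in haystack)


def count_hits_set(haystack: list[int], needles: list[int]) -> int:
    """Same result, but a set gives average constant-time lookups."""
    lookup = set(haystack)
    return sum(1 for n in needles if n in lookup)


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    """Stand-in for an expensive pure function; results are cached per argument."""
    return sum(n for _ in range(n))


def profile(func, *args) -> tuple[object, str]:
    """Run func under cProfile; return its result and a short text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries only
    return result, buffer.getvalue()


# Hypothetical workload: half of the needles are present in the haystack.
haystack = list(range(2_000))
needles = list(range(0, 4_000, 2))

hits_list, report_list = profile(count_hits_list, haystack, needles)
hits_set, report_set = profile(count_hits_set, haystack, needles)
assert hits_list == hits_set  # same answer, different cost

print(report_list)
print(report_set)

slow_square(1_000)  # computed
slow_square(1_000)  # served from the cache
print(slow_square.cache_info())
```

Comparing the two reports should show the set-based version spending far less time for the same answer. The cache only helps because slow_square is a pure function of its argument; caching anything that depends on changing state is where the bugs mentioned above tend to come from.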
Where is it Running? (The Infrastructure)
The environmental impact isn't just about the code; it's also about where and how it runs.
Embodied Energy
This refers to the total energy consumed during a product's entire lifecycle: from raw material extraction, through manufacturing, transportation and usage, to its disposal. We must consider not only the emissions from running our services, but also the emissions from manufacturing the hardware they run on.
Operational Energy – The Netflix Example
Let's look at streaming a Netflix show. Where does the energy go?
Surprisingly, three quarters of the energy is consumed on the user's end. Also, only 6.4% of the energy usage is under Netflix's direct control (their servers).
This shows that optimising the user's end can significantly affect overall energy consumption, but also that much of the energy usage is often outside your direct control. You can still influence it, e.g. by using a better compression algorithm to reduce bandwidth usage, but that doesn't guarantee lower emissions: if decompression takes more energy, it increases consumption at the largest source of emissions, the TV.
CO2e Emissions
Carbon dioxide equivalent (CO2e) is a metric used to compare the emissions from various greenhouse gases based on their global warming potential (GWP) by converting amounts of other gases to the equivalent amount of carbon dioxide with the same GWP.
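The conversion amounts to a weighted sum: the mass of each gas multiplied by its GWP. The sketch below assumes 100-year GWP values of 28 for methane and 265 for nitrous oxide, which are in line with the IPCC Fifth Assessment Report; other assessment reports use somewhat different values, so treat the constants, the function name co2e_kg and the sample masses as illustrative only.

```python
# Approximate 100-year global warming potentials relative to CO2.
# Assumption: values in line with the IPCC Fifth Assessment Report; check
# which report your own reporting standard requires before relying on them.
GWP_100 = {"co2": 1.0, "ch4": 28.0, "n2o": 265.0}


def co2e_kg(emissions_kg: dict[str, float]) -> float:
    """Convert per-gas masses (kg) into one CO2-equivalent mass (kg)."""
    return sum(mass * GWP_100[gas] for gas, mass in emissions_kg.items())


# Hypothetical sample: 10 kg of CO2, 0.5 kg of methane, 0.01 kg of nitrous oxide.
sample = {"co2": 10.0, "ch4": 0.5, "n2o": 0.01}
print(f"{co2e_kg(sample):.2f} kg CO2e")
```

Even small masses of methane or nitrous oxide can outweigh the CO2 itself once they are weighted by their GWP, which is why emissions are usually reported in CO2e rather than CO2 alone.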
The CO2 equivalent (CO2e) emissions for the same activity, like watching 30 minutes of Netflix, vary wildly by country. This is largely due to the energy grid's carbon intensity.
South Africa relies heavily on coal, so emissions are high. France has a high percentage of nuclear power, resulting in very low emissions for the same activity.
Server Region Choice
If you're deploying to the cloud, choosing your server region matters! Some AWS regions, for example, are powered by a higher percentage of renewable energy than others, some at 100% renewable energy (although it must be noted that this is often achieved through renewable energy credits rather than direct power sourcing).
If your application allows it, choosing a region with a cleaner energy mix can reduce its operational carbon footprint with very little impact on your users.
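A rough way to compare regions is to multiply the energy a workload uses by the carbon intensity of each grid. In the sketch below, the region names, the intensity figures and the monthly energy figure are hypothetical placeholders, not real measurements; substitute published figures for the regions you are actually considering.

```python
def operational_emissions_g(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Operational emissions (g CO2e) = energy (kWh) x grid intensity (g CO2e/kWh)."""
    return energy_kwh * intensity_g_per_kwh


def cleanest_region(candidates: dict[str, float]) -> str:
    """Return the candidate region with the lowest carbon intensity."""
    return min(candidates, key=candidates.get)


# Hypothetical regions with placeholder intensities in g CO2e/kWh.
# These are NOT real measurements.
REGION_INTENSITY = {
    "region-coal-heavy": 700.0,
    "region-mixed": 350.0,
    "region-low-carbon": 50.0,
}

monthly_energy_kwh = 120.0  # hypothetical workload

for region, intensity in REGION_INTENSITY.items():
    kg = operational_emissions_g(monthly_energy_kwh, intensity) / 1000
    print(f"{region}: {kg:.1f} kg CO2e per month")

print("Lowest-carbon candidate:", cleanest_region(REGION_INTENSITY))
```

In practice the candidate list would first be filtered by the constraints your application has, such as latency to your users or data-residency requirements, and only then ranked by carbon intensity.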
Who is Creating It? (The Developers & Our Choices)
Our choices as developers, and as consumers, have an impact. Sometimes, these impacts are counterintuitive.
The Almond Milk Conundrum
You might have seen headlines like "When One Almond Gulps 3.2 Gallons of Water" or "Almond milk: quite good for you – very bad for the planet." These headlines suggest that almonds are extremely water-intensive.
Let's do a quick quiz. Without looking up the answer, or reading the correct answer below, think about which answer seems intuitively correct to you.
How much more water does almond milk require to produce compared to cow's milk?
A. 3.9 times
B. 1.8 times
C. 1.0 times (the same)
D. 0.6 times (i.e., 40% less)
The answer, surprisingly, is D. Almond milk uses about 40% less water (0.6 times) and 78% less CO2e than cow's milk.
The headlines often focus on the water per almond, but when you look at the entire lifecycle and compare it to the massive footprint of dairy farming (land use, methane from cows, water for feed and animals), plant-based alternatives generally come out ahead.
This highlights that we need to look at the bigger picture and rely on data, not just catchy headlines. A quote from Joseph Poore, the lead author of the Oxford study, emphasises this:
"A vegan diet is probably the single biggest way to reduce your impact on planet Earth, not just greenhouse gases, but global acidification, eutrophication, land use and water use."
Our Broader Impact
As individuals and as a society, our emissions come from various sectors. Electricity and heat generation are major contributors, especially in South Africa due to our coal dependency. Transport is another significant factor.
The key takeaway here is to estimate where the biggest impacts are and start with the biggest levers you can pull, whether in your code, your infrastructure choices, or even your personal consumption patterns.
In most cases, working from home can greatly reduce emissions for your organisation, but it's important to run the numbers and check whether that is really the case for you.
Summary: What, Where, Who
To recap our green coding journey:
- What (The Code):
  - Measure performance to find bottlenecks.
  - Choose efficient algorithms and data structures.
  - Consider PyPy or optimised libraries for CPU-intensive tasks.
- Where (The Infrastructure):
  - Be mindful of server regions and their energy mix.
  - Remember that user devices can contribute significantly to energy usage.
  - Don't forget embodied energy when considering hardware life-cycles.
- Who (The Creators):
  - Estimate where your biggest environmental impacts lie.
  - Start with the biggest changes you can make – often, these are choices related to diet, transport, or energy consumption at home/office, which then indirectly affect the "embodied energy" of the developers creating the software.
Green coding is a journey, not a destination. Start by measuring, stay curious, and don’t let the perfect be the enemy of the good. If every developer made one greener choice this week, the cumulative impact would be enormous.