Carbon Emissions Impact Analysis with Python

Climate change is one of the most critical challenges of our time, and rising carbon emissions play a pivotal role in driving global temperature anomalies. Analyzing the relationship between CO₂ concentrations and temperature changes provides insight into the patterns and trends that shape our planet’s climate. In this article, I’ll take you through a detailed analysis of the impact of carbon emissions worldwide using Python.

Carbon Emissions Impact Analysis: Overview and Dataset

In this analysis, we aim to explore the impact of carbon emissions on global temperatures. The focus will be on identifying historical trends, detecting anomalies, and simulating potential future scenarios to understand how changes in CO₂ concentrations influence temperature anomalies.

For this, we will work with two primary datasets:

  1. One containing annual temperature changes across multiple countries and decades,
  2. Another featuring monthly CO₂ concentrations measured globally.

By combining these datasets, we will investigate correlations, detect patterns over decades, and use predictive modelling to simulate “what-if” scenarios. This data-driven approach will provide actionable insights into the connection between carbon emissions and climate change, which can offer valuable context for sustainable policy-making. You can download the dataset from here.

Carbon Emissions Impact Analysis with Python

Now, let’s get started with the task of Carbon Emissions Impact Analysis by importing the necessary Python libraries and the dataset:

import pandas as pd

# load the datasets
temperature_data = pd.read_csv('/content/temperature.csv')
co2_data = pd.read_csv('/content/carbon_emmission.csv')

temperature_data_preview = temperature_data.head()
co2_data_preview = co2_data.head()

temperature_data_preview, co2_data_preview
(   ObjectId                       Country ISO2 ISO3  F1961  F1962  F1963  \
0         1  Afghanistan, Islamic Rep. of   AF  AFG -0.113 -0.164  0.847
1         2                       Albania   AL  ALB  0.627  0.326  0.075
2         3                       Algeria   DZ  DZA  0.164  0.114  0.077
3         4                American Samoa   AS  ASM  0.079 -0.042  0.169
4         5      Andorra, Principality of   AD  AND  0.736  0.112 -0.752

   F1964  F1965  F1966  ...  F2013  F2014  F2015  F2016  F2017  F2018  F2019  \
0 -0.764 -0.244  0.226  ...  1.281  0.456  1.093  1.555  1.540  1.544  0.910
1 -0.166 -0.388  0.559  ...  1.333  1.198  1.569  1.464  1.121  2.028  1.675
2  0.250 -0.100  0.433  ...  1.192  1.690  1.121  1.757  1.512  1.210  1.115
3 -0.140 -0.562  0.181  ...  1.257  1.170  1.009  1.539  1.435  1.189  1.539
4  0.308 -0.490  0.415  ...  0.831  1.946  1.690  1.990  1.925  1.919  1.964

   F2020  F2021  F2022
0  0.498  1.327  2.012
1  1.498  1.536  1.518
2  1.926  2.330  1.688
3  1.430  1.268  1.256
4  2.562  1.533  3.243

[5 rows x 66 columns],
   ObjectId Country     Date   Value
0         1   World  1958M03  315.70
1         2   World  1958M04  317.45
2         3   World  1958M05  317.51
3         4   World  1958M06  317.24
4         5   World  1958M07  315.86)

We are using two datasets:

  1. Temperature Data: Annual temperature anomalies measured in degrees Celsius across decades.
  2. CO₂ Data: Monthly global atmospheric CO₂ concentrations in parts per million (ppm).
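Both files need a little reshaping before yearly values can be compared. The sketch below shows one way to do it on a tiny made-up frame: the toy column values, the name long_df, and the Anomaly column are assumptions for illustration only, not part of the real files.

```python
import pandas as pd

# Toy frame mimicking the wide layout of the temperature file (values are made up)
wide = pd.DataFrame({
    "Country": ["A", "B"],
    "ISO3": ["AAA", "BBB"],
    "F1961": [0.1, 0.3],
    "F1962": [0.2, None],
})

# One row per (country, year) instead of one column per year
long_df = wide.melt(id_vars=["Country", "ISO3"], var_name="Year", value_name="Anomaly")
long_df["Year"] = long_df["Year"].str[1:].astype(int)   # "F1961" -> 1961
long_df = long_df.dropna(subset=["Anomaly"]).reset_index(drop=True)

# Monthly CO₂ labels such as "1958M03" can be split into year and month the same way
dates = pd.Series(["1958M03", "1958M04"])
parsed = pd.DataFrame({
    "Year": dates.str[:4].astype(int),
    "Month": dates.str[-2:].astype(int),
})

print(long_df)
print(parsed)
```

The long layout is optional for this article, but it makes per-country filtering and grouping simpler than working with 60+ year columns.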

Now, let’s calculate key statistics for temperature changes and CO₂ concentrations, such as mean, median, and variance:

# selecting and computing statistics for temperature changes
temperature_values = temperature_data.filter(regex='^F').stack()  # extracting all year columns
temperature_stats = {
    "Mean": temperature_values.mean(),
    "Median": temperature_values.median(),
    "Variance": temperature_values.var()
}

# computing statistics for CO2 concentrations
co2_values = co2_data["Value"]  # extracting the Value column
co2_stats = {
    "Mean": co2_values.mean(),
    "Median": co2_values.median(),
    "Variance": co2_values.var()
}

temperature_stats, co2_stats
({'Mean': 0.5377713483146068, 'Median': 0.47, 'Variance': 0.4294524831504378},
{'Mean': 180.71615286624203,
'Median': 313.835,
'Variance': 32600.00200469294})

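The mean temperature change is approximately 0.54°C, with a median of 0.47°C and a variance of 0.43 (a standard deviation of roughly 0.66°C), so anomalies are spread fairly widely around the mean. For the CO₂ Value column, the mean is 180.72, the median is much higher at 313.84, and the variance is about 32,600. A mean this far below the median, and below every reading in the preview above (all around 315 ppm or higher), suggests that the Value column may contain rows that are not ppm concentrations, so it is worth inspecting the file before reading these figures as ppm statistics. Note also that the two variances are in different units (°C² and ppm²) and cannot be compared directly.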
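When a mean sits far below the median, a quick range check usually shows why. Here is a minimal sketch on a made-up series that mixes ppm-like readings with small values; the numbers and the cut-off of 100 are assumptions chosen only for this toy example, not a rule for the real dataset.

```python
import pandas as pd

# Made-up values: five ppm-like readings mixed with three small values in another unit
values = pd.Series([315.7, 317.4, 0.4, 316.9, 0.5, 318.2, 0.6, 319.0])

# describe() exposes the minimum; a value near zero is implausible for atmospheric ppm
print(values.describe())

# Hypothetical cut-off for this toy series: keep only plausible concentration readings
plausible = values[values > 100]
print("mean before:", round(values.mean(), 2), "median before:", round(values.median(), 2))
print("mean after:", round(plausible.mean(), 2), "median after:", round(plausible.median(), 2))
```

If the real file has a column that identifies the unit of each row, filtering on that column would be a cleaner fix than a numeric cut-off.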

Time-Series Analysis

Next, we’ll examine how temperature changes and CO₂ concentrations have evolved over time and the relationships between them:

import plotly.graph_objects as go
import plotly.express as px

# extracting time-series data for plotting
# temperature: averaging across countries for each year
temperature_years = temperature_data.filter(regex='^F').mean(axis=0)
temperature_years.index = temperature_years.index.str.replace('F', '').astype(int)

# CO2: parsing year and averaging monthly data
co2_data['Year'] = co2_data['Date'].str[:4].astype(int)
co2_yearly = co2_data.groupby('Year')['Value'].mean()

# time-series plot for temperature and CO2 levels
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=temperature_years.index, y=temperature_years.values,
    mode='lines+markers', name="Temperature Change (°C)"
))
fig.add_trace(go.Scatter(
    x=co2_yearly.index, y=co2_yearly.values,
    mode='lines+markers', name="CO₂ Concentration (ppm)", line=dict(dash='dash')
))
fig.update_layout(
    title="Time-series of Temperature Change and CO₂ Concentrations",
    xaxis_title="Year",
    yaxis_title="Values",
    template="plotly_white",
    legend_title="Metrics"
)
fig.show()

# correlation heatmap
merged_data = pd.DataFrame({
    "Temperature Change": temperature_years,
    "CO₂ Concentration": co2_yearly
}).dropna()

heatmap_fig = px.imshow(
    merged_data.corr(),
    text_auto=".2f",
    color_continuous_scale="RdBu",  # diverging colormap similar to coolwarm
    title="Correlation Heatmap"
)
heatmap_fig.update_layout(
    template="plotly_white"
)
heatmap_fig.show()

# scatter plot: temperature vs CO2 concentrations
scatter_fig = px.scatter(
    merged_data,
    x="CO₂ Concentration", y="Temperature Change",
    labels={"CO₂ Concentration": "CO₂ Concentration (ppm)", "Temperature Change": "Temperature Change (°C)"},
    title="Temperature Change vs CO₂ Concentration",
    template="plotly_white"
)
scatter_fig.update_traces(marker=dict(size=10, opacity=0.7))
scatter_fig.show()

[Figure: Time-series of Temperature Change and CO₂ Concentrations]

The time-series graph shows a consistent increase in CO₂ concentrations (measured in ppm) over the years, which indicates the accumulation of greenhouse gases in the atmosphere. Over the same period, global temperature change also trends upward, although it looks almost flat here because both series share a single y-axis and the CO₂ values are orders of magnitude larger. The temporal alignment is consistent with the hypothesis that CO₂ contributes significantly to temperature increase.
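One simple way to compare the shapes of two series on very different scales is to standardize each of them first. The sketch below uses made-up yearly values; zscore is a hypothetical helper name, and this is just one option (a secondary y-axis is another).

```python
import pandas as pd

# Made-up yearly series on very different scales
temp = pd.Series([0.1, 0.2, 0.4, 0.7], index=[2000, 2001, 2002, 2003])
co2 = pd.Series([370.0, 372.0, 374.0, 376.0], index=[2000, 2001, 2002, 2003])

def zscore(s):
    """Rescale a series to mean 0 and standard deviation 1."""
    return (s - s.mean()) / s.std(ddof=0)

standardized = pd.DataFrame({
    "Temperature (z)": zscore(temp),
    "CO₂ (z)": zscore(co2),
})
print(standardized.round(2))
```

The standardized columns can be passed to the same plotting code as before, with the y-axis relabelled as standard deviations from the mean.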

[Figure: Correlation Heatmap]

The heatmap reveals a strong positive correlation (0.96) between CO₂ concentrations and temperature changes. This statistical relationship reinforces the observation that higher CO₂ levels are closely linked with increasing global temperatures, which highlights the importance of addressing carbon emissions to mitigate climate change.
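A caveat worth keeping in mind: any two series that both trend upward will show a high correlation in levels, even if their year-to-year movements are unrelated. The sketch below illustrates this with two made-up series (a and b are purely synthetic); comparing the correlation of levels with the correlation of year-to-year changes is a common sanity check.

```python
import numpy as np

t = np.arange(60)
# Two made-up series that share an upward trend but have unrelated wiggles
a = 1.0 * t + np.sin(t)
b = 2.0 * t + np.cos(t)

# Correlation of the raw levels is dominated by the shared trend
corr_levels = np.corrcoef(a, b)[0, 1]

# Correlation of first differences looks only at year-to-year changes
corr_changes = np.corrcoef(np.diff(a), np.diff(b))[0, 1]

print("levels:", round(corr_levels, 3), "changes:", round(corr_changes, 3))
```

Running the same comparison on merged_data.diff().dropna() would show how much of the 0.96 comes from the common trend.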

[Figure: Temperature Change vs CO₂ Concentration]

The scatter plot shows a clear, roughly linear trend, where higher CO₂ concentrations correspond to greater temperature changes. This visual evidence underscores the close association between atmospheric CO₂ and global warming, which provides further support for policies targeting reductions in carbon emissions to combat climate impacts.

Trends and Seasonal Variations Analysis

Now, let’s identify long-term trends using linear regression, and seasonal variations by averaging CO₂ concentrations for each calendar month:

from scipy.stats import linregress

# temperature trend
temp_trend = linregress(temperature_years.index, temperature_years.values)
temp_trend_line = temp_trend.slope * temperature_years.index + temp_trend.intercept

# CO2 trend
co2_trend = linregress(co2_yearly.index, co2_yearly.values)
co2_trend_line = co2_trend.slope * co2_yearly.index + co2_trend.intercept

fig_trends = go.Figure()

fig_trends.add_trace(go.Scatter(
    x=temperature_years.index, y=temperature_years.values,
    mode='lines+markers', name="Temperature Change (°C)"
))
fig_trends.add_trace(go.Scatter(
    x=temperature_years.index, y=temp_trend_line,
    mode='lines', name=f"Temperature Trend (Slope: {temp_trend.slope:.2f})", line=dict(dash='dash')
))
fig_trends.add_trace(go.Scatter(
    x=co2_yearly.index, y=co2_yearly.values,
    mode='lines+markers', name="CO₂ Concentration (ppm)"
))
fig_trends.add_trace(go.Scatter(
    x=co2_yearly.index, y=co2_trend_line,
    mode='lines', name=f"CO₂ Trend (Slope: {co2_trend.slope:.2f})", line=dict(dash='dash')
))

fig_trends.update_layout(
    title="Trends in Temperature Change and CO₂ Concentrations",
    xaxis_title="Year",
    yaxis_title="Values",
    template="plotly_white",
    legend_title="Metrics"
)
fig_trends.show()

# seasonal variations in CO2 concentrations
co2_data['Month'] = co2_data['Date'].str[-2:].astype(int)
co2_monthly = co2_data.groupby('Month')['Value'].mean()

fig_seasonal = px.line(
    co2_monthly,
    x=co2_monthly.index,
    y=co2_monthly.values,
    labels={"x": "Month", "y": "CO₂ Concentration (ppm)"},
    title="Seasonal Variations in CO₂ Concentrations",
    markers=True
)
fig_seasonal.update_layout(
    xaxis=dict(tickmode="array", tickvals=list(range(1, 13))),
    template="plotly_white"
)
fig_seasonal.show()

[Figure: Trends in Temperature Change and CO₂ Concentrations]

The graph shows the linear trends in both temperature change and CO₂ concentrations over time, represented by their respective slopes. The fitted CO₂ slope is 0.32 ppm per year and the fitted temperature slope is 0.03°C per year. Because the two slopes are in different units, their sizes cannot be compared directly. What the graph does show is that both series rise steadily over the same period, and that even a small yearly temperature increase accumulates into a substantial change over several decades.
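To see why raw slopes in different units are not comparable, here is a small sketch with made-up, perfectly linear series. Dividing each slope by the standard deviation of its own series is one possible way (an assumption of this example, not the only option) to put the trends on a common, unit-free scale.

```python
import numpy as np

years = np.arange(1990, 2000)
x = years - years[0]                 # years since start keeps the fit well conditioned
co2 = 350.0 + 1.5 * x                # made-up: +1.5 ppm per year
temp = 0.20 + 0.02 * x               # made-up: +0.02 °C per year

# Least-squares slopes, in ppm per year and °C per year respectively
co2_slope = np.polyfit(x, co2, 1)[0]
temp_slope = np.polyfit(x, temp, 1)[0]

# Unit-free version: standard deviations of the series per year
co2_std_slope = co2_slope / co2.std()
temp_std_slope = temp_slope / temp.std()

print("raw slopes:", co2_slope, temp_slope)
print("standardized slopes:", co2_std_slope, temp_std_slope)
```

In this toy case the raw slopes differ by a factor of 75, yet the standardized slopes are identical, because both series are equally "steep" relative to their own variability.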

[Figure: Seasonal Variations in CO₂ Concentrations]

The above graph highlights the seasonal fluctuations in CO₂ concentrations, which peak during late spring and early summer (around May) and reach the lowest levels in fall (around September). These variations are likely due to natural processes such as plant photosynthesis, which absorbs CO₂ during the growing season, and respiration, which releases CO₂ in the off-season. This seasonal cycle underscores the role of natural carbon sinks in moderating atmospheric CO₂ levels.

Correlation and Causality Analysis

To quantify the relationship between CO₂ and temperature anomalies, we will now compute Pearson and Spearman correlation coefficients. To investigate whether past changes in CO₂ help predict temperature anomalies, we will perform Granger Causality tests (which test predictive precedence rather than physical causation):

from scipy.stats import pearsonr, spearmanr
from statsmodels.tsa.stattools import grangercausalitytests

# pearson and spearman correlation coefficients
pearson_corr, _ = pearsonr(merged_data["CO₂ Concentration"], merged_data["Temperature Change"])
spearman_corr, _ = spearmanr(merged_data["CO₂ Concentration"], merged_data["Temperature Change"])

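# granger causality test
# the test checks whether the second column (CO₂ Concentration) helps predict the first (Temperature Change)
granger_data = merged_data.diff().dropna()  # first differencing to make data stationary
granger_results = grangercausalitytests(granger_data, maxlag=3, verbose=False)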

# extracting p-values for causality
granger_p_values = {f"Lag {lag}": round(results[0]['ssr_chi2test'][1], 4)
                    for lag, results in granger_results.items()}

pearson_corr, spearman_corr, granger_p_values
(0.9554282559257312,
0.9379013371609882,
{'Lag 1': 0.0617, 'Lag 2': 0.6754, 'Lag 3': 0.2994})

Pearson Correlation (0.9554) indicates a very strong linear relationship between CO₂ concentrations and temperature changes. Spearman Correlation (0.9379) indicates a very strong monotonic relationship between CO₂ concentrations and temperature changes.
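The difference between the two coefficients is easiest to see on a relationship that is monotonic but not linear. The sketch below uses made-up values and relies on the fact that Spearman's coefficient is Pearson's coefficient computed on ranks.

```python
import pandas as pd

x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 3                               # made-up: monotonic but clearly non-linear

pearson = x.corr(y)                      # Pearson on the raw values
spearman = x.rank().corr(y.rank())       # Spearman = Pearson on the ranks

print("pearson:", round(pearson, 4), "spearman:", round(spearman, 4))
```

In our data the two coefficients are close to each other (0.9554 and 0.9379), which is what we would expect when the relationship is both monotonic and roughly linear.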

Granger Causality Test: The p-values for lags 1, 2, and 3 are as follows:

  • Lag 1: 0.0617 (slightly above the common significance threshold of 0.05, suggesting weak evidence for causality).
  • Lag 2: 0.6754 (not significant, no evidence of causality).
  • Lag 3: 0.2994 (not significant, no evidence of causality).

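There is a very strong correlation between CO₂ concentrations and temperature changes. However, the Granger Causality tests, which check whether past year-to-year changes in CO₂ improve predictions of year-to-year temperature changes, do not find statistically significant evidence at the 0.05 level for any of the lags tested. With only about sixty annual observations, this is weak evidence either way rather than proof that no causal link exists.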
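To make the idea behind the test concrete, here is a hand-rolled lag-1 sketch on made-up data. It compares a regression of y on its own past with one that also includes the past of x, in the spirit of an SSR-based F test. This is a simplified illustration under those assumptions, not a reimplementation of the statsmodels function; the helper ssr and both toy series are hypothetical.

```python
import numpy as np

t = np.arange(61)
x = np.sin(0.7 * t)                               # made-up driver series
y = np.empty_like(x)
y[0] = 0.0
y[1:] = 0.5 * x[:-1] + 0.1 * np.cos(1.3 * t[1:])  # y follows x with a one-step delay

def ssr(design, target):
    """Sum of squared residuals from an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

target = y[2:]                       # y_t
ones = np.ones_like(target)
y_lag = y[1:-1]                      # y_{t-1}
x_lag = x[1:-1]                      # x_{t-1}

# Restricted model: y_t explained by its own past only
ssr_restricted = ssr(np.column_stack([ones, y_lag]), target)
# Unrestricted model: the past of x is added as a predictor
ssr_unrestricted = ssr(np.column_stack([ones, y_lag, x_lag]), target)

n, k, q = len(target), 3, 1          # observations, unrestricted parameters, restrictions
f_stat = ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - k))
print("F statistic:", round(f_stat, 1))
```

A large F statistic means the lagged x values reduce the prediction error substantially, which is exactly what "x Granger-causes y" means; it says nothing about the physical mechanism.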

Lagged Effects Analysis

Now, we will analyze whether CO₂ concentrations from previous years (lagged values) influence current temperature anomalies. To do this, we will create lagged variables for CO₂ concentrations, specifically shifting the data by 1, 2, and 3 years. These lagged values will allow us to test if historical CO₂ levels have a delayed impact on temperature changes.

After creating these lagged variables, we will fit an Ordinary Least Squares (OLS) regression model. This model will use current and lagged CO₂ levels as predictors to estimate their contribution to current temperature anomalies. By examining the regression results, we will determine:

  1. How strongly current CO₂ levels affect temperature changes.
  2. Whether CO₂ levels from previous years have a significant impact.

import statsmodels.api as sm

# creating lagged CO2 data on a copy, so merged_data keeps only its two original columns
lagged_data = merged_data.copy()
lagged_data['CO₂ Lag 1'] = lagged_data["CO₂ Concentration"].shift(1)
lagged_data['CO₂ Lag 2'] = lagged_data["CO₂ Concentration"].shift(2)
lagged_data['CO₂ Lag 3'] = lagged_data["CO₂ Concentration"].shift(3)

# dropping rows with NaN due to lags
lagged_data = lagged_data.dropna()

X = lagged_data[['CO₂ Concentration', 'CO₂ Lag 1', 'CO₂ Lag 2', 'CO₂ Lag 3']]
y = lagged_data['Temperature Change']
X = sm.add_constant(X)  # adding a constant for intercept

model = sm.OLS(y, X).fit()

model_summary = model.summary()
model_summary

[Figure: Lagged Effects Analysis (OLS regression summary)]

The OLS regression results indicate a strong relationship between CO₂ concentration and temperature change, with an R-squared value of 0.949, meaning 94.9% of the variance in temperature change is explained by the model. The coefficient for CO₂ concentration (0.3245) is statistically significant (p < 0.05), which suggests a positive association between CO₂ levels and temperature change. Keep in mind that current and lagged CO₂ values are highly correlated with one another, so the individual coefficients and their p-values should be interpreted with caution.
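A quick way to gauge how much the predictors overlap is to look at the correlations among a steadily rising series and its own lags. The sketch below uses a made-up series standing in for yearly CO₂; the numbers are synthetic and only the pattern matters.

```python
import numpy as np
import pandas as pd

t = np.arange(60)
# Made-up, steadily rising series standing in for yearly CO₂
co2 = pd.Series(315.0 + 1.5 * t + 0.5 * np.sin(t))

lags = pd.DataFrame({
    "CO₂": co2,
    "Lag 1": co2.shift(1),
    "Lag 2": co2.shift(2),
    "Lag 3": co2.shift(3),
}).dropna()

# Pairwise correlations between the current value and its lags
corr = lags.corr()
print(corr.round(4))
```

When predictors are this close to each other, a regression can fit the data well overall while splitting the effect between the lags almost arbitrarily, which is why the R-squared is more trustworthy here than any single coefficient.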

Clustering Climate Patterns

Next, we group years based on similarities in temperature anomalies and CO₂ concentrations using K-Means clustering:

from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import numpy as np

# preparing the data for clustering
clustering_data = merged_data[["Temperature Change", "CO₂ Concentration"]].dropna()

scaler = StandardScaler()
scaled_data = scaler.fit_transform(clustering_data)

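# applying K-Means clustering
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)  # assuming 3 clusters for simplicity
clustering_data['Cluster'] = kmeans.fit_predict(scaled_data)

# adding labels for periods with similar climate patterns
# K-Means cluster ids are arbitrary, so rank the clusters by mean CO₂ before naming them
cluster_order = clustering_data.groupby('Cluster')['CO₂ Concentration'].mean().sort_values().index
clustering_data['Label'] = clustering_data['Cluster'].map(
    dict(zip(cluster_order, ['Low Temp & CO₂', 'Moderate Temp & CO₂', 'High Temp & CO₂']))
)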

import plotly.express as px

fig_clusters = px.scatter(
    clustering_data,
    x="CO₂ Concentration",
    y="Temperature Change",
    color="Label",
    color_discrete_sequence=px.colors.qualitative.Set2,
    labels={
        "CO₂ Concentration": "CO₂ Concentration (ppm)",
        "Temperature Change": "Temperature Change (°C)",
        "Label": "Climate Pattern"
    },
    title="Clustering of Years Based on Climate Patterns"
)

fig_clusters.update_layout(
    template="plotly_white",
    legend_title="Climate Pattern"
)

fig_clusters.show()

[Figure: Clustering of Years Based on Climate Patterns]

The clustering graph segments years into three distinct climate patterns based on CO₂ concentration and temperature change: low CO₂ and temperature (green), moderate CO₂ and temperature (orange), and high CO₂ and temperature (blue). The progression from green to orange and then to blue clusters reflects a clear trend of increasing temperature change corresponding to rising CO₂ levels, effectively illustrating the correlation between greenhouse gas concentrations and global temperature variations.

This clustering emphasizes the cumulative and escalating impact of carbon emissions on global temperature patterns, which illustrates the need for targeted interventions to mitigate future increases.
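Since both variables rise over time, it is useful to check which years fall into each cluster. The sketch below shows one way to summarize that, using a made-up frame as a stand-in for clustering_data (the labels and years are invented for illustration).

```python
import pandas as pd

# Made-up cluster assignments indexed by year (stand-in for clustering_data)
toy = pd.DataFrame(
    {"Label": ["Low", "Low", "Moderate", "Moderate", "High", "High"]},
    index=pd.Index([1961, 1975, 1990, 2000, 2010, 2022], name="Year"),
)

# First year, last year, and number of years in each cluster, ordered chronologically
spans = toy.reset_index().groupby("Label")["Year"].agg(["min", "max", "count"])
spans = spans.sort_values("min")
print(spans)
```

If the clusters turn out to be consecutive blocks of years, the clustering is essentially dividing the timeline into early, middle, and recent periods.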

Predicting Temperature Changes Under What-If Analysis

Now, we will use a simple linear regression model to simulate how changes in CO₂ concentrations might influence global temperatures. By leveraging the historical relationship between CO₂ concentrations and temperature anomalies, this model allows us to predict the potential impact of different emission scenarios.

First, we will train a linear regression model with CO₂ concentrations as the input and temperature anomalies as the output. Once the model is trained, we can simulate hypothetical scenarios where CO₂ concentrations increase or decrease by a specific percentage.

For each scenario, we will adjust the average CO₂ concentration over the whole record by the specified percentage, feed it into the model, and predict the corresponding temperature anomaly. The scenarios we simulate include:

  1. Increase CO₂ by 10%: Predict the rise in temperature anomalies.
  2. Decrease CO₂ by 10%: Estimate the cooling effect.
  3. Increase CO₂ by 20%: Analyze the impact of more aggressive emissions growth.
  4. Decrease CO₂ by 20%: Evaluate the benefit of significant emission reductions.

# setting up a simple predictive model using linear regression
from sklearn.linear_model import LinearRegression

# Preparing data
X = merged_data[["CO₂ Concentration"]].values  # CO₂ concentration as input
y = merged_data["Temperature Change"].values   # temperature change as target

model = LinearRegression()
model.fit(X, y)

# function to simulate "what-if" scenarios
def simulate_temperature_change(co2_percentage_change):
    # calculate the new CO2 concentration
    # note: the baseline is the mean over the whole record, not the most recent year
    current_mean_co2 = merged_data["CO₂ Concentration"].mean()
    new_co2 = current_mean_co2 * (1 + co2_percentage_change / 100)

    # predict temperature change
    predicted_temp = model.predict([[new_co2]])
    return predicted_temp[0]

# simulating scenarios
scenarios = {
    "Increase CO₂ by 10%": simulate_temperature_change(10),
    "Decrease CO₂ by 10%": simulate_temperature_change(-10),
    "Increase CO₂ by 20%": simulate_temperature_change(20),
    "Decrease CO₂ by 20%": simulate_temperature_change(-20),
}

scenarios
{'Increase CO₂ by 10%': 1.0866445037958163,
'Decrease CO₂ by 10%': -0.059993041237237144,
'Increase CO₂ by 20%': 1.6599632763123422,
'Decrease CO₂ by 20%': -0.6333118137537621}

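A 10% increase in CO₂ raises the predicted temperature anomaly to about 1.09°C and a 20% increase to about 1.66°C, which demonstrates how sensitive the fitted model is to CO₂ levels. Conversely, a 10% reduction gives a predicted anomaly of about -0.06°C and a 20% reduction about -0.63°C. Because the model is a straight line, every 10% step shifts the prediction by the same amount (about 0.57°C). These figures are extrapolations of a historical correlation, not physical climate projections.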
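Because the model is linear, the four scenario outputs are enough to recover the size of each step and the prediction at 0% change. The short check below uses only the numbers reported above; the variable names are my own.

```python
# Predictions reported above, keyed by percentage change in CO₂
reported = {
    10: 1.0866445037958163,
    -10: -0.059993041237237144,
    20: 1.6599632763123422,
    -20: -0.6333118137537621,
}

# In a linear model, +10% and -10% sit symmetrically around the 0% prediction
step_per_10pct = (reported[10] - reported[-10]) / 2
baseline = (reported[10] + reported[-10]) / 2

# The ±20% scenarios should then be two steps away from the baseline
check_plus_20 = baseline + 2 * step_per_10pct
check_minus_20 = baseline - 2 * step_per_10pct

print("step per 10%:", round(step_per_10pct, 4))
print("prediction at 0% change:", round(baseline, 4))
print("reconstructed ±20%:", round(check_plus_20, 4), round(check_minus_20, 4))
```

The reconstructed ±20% values match the model output, which confirms that the scenarios carry no information beyond the slope and intercept of the fitted line.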

Summary

Our analysis highlights a strong positive correlation between rising CO₂ concentrations and global temperature anomalies. Time-series and clustering analyses show both quantities rising together over the decades, while seasonal variations underscore the moderating role of natural carbon sinks. In the lagged regression, the current-year CO₂ level has a statistically significant coefficient, although the Granger Causality tests did not find significant evidence of lagged effects. Simulating “what-if” scenarios with a linear model demonstrates how strongly the fitted temperature anomaly responds to CO₂ levels, which suggests that reductions in atmospheric CO₂ would be associated with lower temperature anomalies. These findings underline the urgent need for actionable policies to address climate change effectively.

I hope you liked this article on Carbon Emissions Impact Analysis with Python.