DETERMINATION OF TREND OF WEATHER PARAMETERS, CLIMATE EXTREMES, ANOMALIES AND FORECASTING OF NEXT YEAR TREND BY AI/ML ANALYSIS.
Research Objective. Key Highlights: The station under analysis is Alipore (station code 42807), and the period of analysis is 1969 to August 2025. The data were obtained from the IMD Pune online data supply platform. 🌡️ Trends: Analyse historical patterns of temperature, precipitation, and sunshine. ⚠️ Extremes: Detect heatwaves, cold spells, and heavy rainfall events. 🔮 Forecasting: Use the Prophet model to predict temperature and precipitation for the next year. 🛠️ Machine Learning Features: Generate advanced features for ML models. 📈 Visuals: Use heat maps, trend plots, and scatter plots to uncover relationships. 🔗 Comparisons: Save the derived weather file obtained from the analysis for comparison with other research areas.
1. Descriptive Statistics and Trends Analysis The first part determines the descriptive statistics of the dataset: the total count, mean, standard deviation, minimum, quartiles, and maximum of each parameter. Mean temperature is defined as the mean of the daily maximum and minimum temperatures. Based on the historical data, the annual trend of mean temperature and the monthly average precipitation are determined, to see whether there is any indication of climate change, drought, or heavy rainfall. Seasonal and annual averages are also analysed. Here are the pictures of the annual mean temperature trend and the monthly average precipitation.
The figures are the annual sunshine duration trend, precipitation outliers by season, average mean temperature by season, and average maximum temperature by season.
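The descriptive statistics step can be sketched in pandas as follows. This is only an illustration on synthetic data, not the code used in the study; the column names TX and TN for maximum and minimum temperature are assumptions, while TG and RF follow the names used later in this presentation.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the station file (assumed columns: Date, TX, TN, RF).
rng = np.random.default_rng(0)
dates = pd.date_range("2000-01-01", "2003-12-31", freq="D")
doy = dates.dayofyear.to_numpy()
cycle = np.sin(2 * np.pi * (doy - 100) / 365.25)
df = pd.DataFrame({
    "Date": dates,
    "TX": 31 + 5 * cycle + rng.normal(0, 1.5, len(dates)),
    "TN": 22 + 6 * cycle + rng.normal(0, 1.5, len(dates)),
    "RF": rng.gamma(0.3, 10.0, len(dates)),
})

# Mean temperature defined as the mean of daily maximum and minimum.
df["TG"] = (df["TX"] + df["TN"]) / 2

# Count, mean, standard deviation, minimum, quartiles and maximum per parameter.
summary = df[["TX", "TN", "TG", "RF"]].describe()

# Annual mean temperature and average precipitation per calendar month.
annual_tg = df.groupby(df["Date"].dt.year)["TG"].mean()
monthly_rf = df.groupby(df["Date"].dt.month)["RF"].mean()

print(summary.round(2))
```

Plotting annual_tg against the year and monthly_rf against the month gives figures of the kind shown on this slide.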
2. Time-Series Analysis The second part deals with the decomposition of weather parameters into trend, seasonality, and residual components to understand cyclical changes. The trend gives insight into the long-term change of a weather parameter, seasonality reveals the repeating short-term pattern, and the residual captures irregular fluctuations that cannot be explained by trend or seasonality. This part also covers rolling averages. A rolling mean takes the average of a fixed number of consecutive data points in a time series and then slides that window forward one step at a time to get a new average. Rolling averages are calculated to smooth out daily fluctuations and identify long-term trends. Anomaly detection is used to find unusual weather events such as heatwaves, cold spells, or sudden increases in precipitation.
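A minimal sketch of rolling averages and an additive decomposition, using pandas only. The decomposition here is a simple hand-built one (a centered 365-day rolling mean as the trend, calendar-month means of the detrended series as the seasonality); it is an assumption for illustration and may differ from the decomposition routine used in the study. The data are synthetic.

```python
import numpy as np
import pandas as pd

# Synthetic daily mean temperature with a slow trend and an annual cycle.
rng = np.random.default_rng(1)
dates = pd.date_range("2000-01-01", "2005-12-31", freq="D")
doy = dates.dayofyear.to_numpy()
tg = pd.Series(
    26
    + 0.0003 * np.arange(len(dates))
    + 5 * np.sin(2 * np.pi * (doy - 100) / 365.25)
    + rng.normal(0, 1, len(dates)),
    index=dates,
    name="TG",
)

# Rolling averages: mean of the last 7 (or 30) days, sliding one day at a time.
roll_7 = tg.rolling(window=7).mean()
roll_30 = tg.rolling(window=30).mean()

# Trend: centered 365-day rolling mean (undefined near both ends of the series).
trend = tg.rolling(window=365, center=True, min_periods=365).mean()

# Seasonality: average detrended value for each calendar month.
detrended = tg - trend
seasonal = detrended.groupby(detrended.index.month).transform("mean")

# Residual: what trend and seasonality together do not explain.
residual = tg - trend - seasonal
```

By construction, trend + seasonal + residual reproduces the original series wherever the trend is defined.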
Here are the pictures of the rolling averages of mean temperature and of anomaly detection in daily precipitation, daily maximum temperature, and daily minimum temperature. For daily precipitation, a value above the upper threshold, mean + 3 × standard deviation, is considered an anomaly. Similarly, for maximum temperature, a value above the upper threshold is considered an anomaly, and for minimum temperature, a value below the lower threshold, mean - 3 × standard deviation, is considered an anomaly.
Here are the pictures of anomaly detection in daily relative humidity at the upper level and at the lower level, the monthly average sunshine duration, and the monthly average maximum temperature. The upper and lower thresholds for relative humidity are set in the same way, as mean + 3 × standard deviation and mean - 3 × standard deviation.
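The three-standard-deviation rule can be sketched as a small helper. The function name three_sigma_flags is hypothetical and the rainfall series is synthetic, with one extreme value injected so that the rule has something to find.

```python
import numpy as np
import pandas as pd


def three_sigma_flags(series: pd.Series) -> pd.DataFrame:
    """Flag values outside mean +/- 3 * standard deviation of the series."""
    mean = series.mean()
    std = series.std()
    upper = mean + 3 * std
    lower = mean - 3 * std
    return pd.DataFrame({
        "value": series,
        "above_upper": series > upper,   # used for precipitation and maximum temperature
        "below_lower": series < lower,   # used for minimum temperature
    })


# Synthetic daily rainfall with one injected extreme day.
rng = np.random.default_rng(2)
rf = pd.Series(rng.gamma(0.3, 10.0, 2000), name="RF")
rf.iloc[100] = 400.0

flags = three_sigma_flags(rf)
print(flags[flags["above_upper"]].head())
```

The same helper applies to maximum temperature (above_upper), minimum temperature (below_lower), and relative humidity (both columns).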
3. Correlation and Relationship Analysis The third part of the analysis deals with the correlations and relationships between the significant weather variables, all of which are numerical. A scatter plot is drawn between sunshine duration and mean temperature. A seaborn pair plot is used to examine pairwise relationships between the key variables: daily mean temperature TG, daily rainfall RF, daily relative humidity RH, daily sunshine duration SSH, and daily total cloud amount TC in oktas. For seasonal analysis, the data are grouped by season and the correlations are calculated. The year is divided into four meteorological seasons: January and February as winter, March to May as pre-monsoon, June to September as monsoon, and October to December as post-monsoon. The correlations between the key variables are calculated for each season, and the correlation matrix between the key variables is plotted for the pre-monsoon months.
This slide continues from the previous slide. As described there, these are the figures of the pair plot for the key weather variables, showing their pairwise relationships, and of the correlation matrix for the pre-monsoon season.
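The season grouping and per-season correlation can be sketched as below. The helper month_to_season and the synthetic values are assumptions for illustration; the month boundaries and the variable names TG, RF, RH, SSH, TC are those given on this slide.

```python
import numpy as np
import pandas as pd


def month_to_season(month: int) -> str:
    """Map a calendar month to the four seasons defined on this slide."""
    if month in (1, 2):
        return "Winter"
    if month in (3, 4, 5):
        return "Pre-monsoon"
    if month in (6, 7, 8, 9):
        return "Monsoon"
    return "Post-monsoon"


# Synthetic daily data; TG is made to depend on SSH so a correlation appears.
rng = np.random.default_rng(3)
dates = pd.date_range("2001-01-01", "2002-12-31", freq="D")
n = len(dates)
ssh = rng.uniform(0, 11, n)
df = pd.DataFrame({
    "Date": dates,
    "TG": 24 + 0.5 * ssh + rng.normal(0, 1, n),
    "RF": rng.gamma(0.3, 10.0, n),
    "RH": rng.uniform(40, 100, n),
    "SSH": ssh,
    "TC": rng.integers(0, 9, n),  # cloud amount in oktas, 0 to 8
})

df["Season"] = df["Date"].dt.month.map(month_to_season)

key_vars = ["TG", "RF", "RH", "SSH", "TC"]
seasonal_corr = {
    season: group[key_vars].corr() for season, group in df.groupby("Season")
}
pre_monsoon_corr = seasonal_corr["Pre-monsoon"]
print(pre_monsoon_corr.round(2))
```

With seaborn installed, seaborn.pairplot(df[key_vars]) draws the pair plot and seaborn.heatmap(pre_monsoon_corr, annot=True) draws the correlation matrix.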
4. Climate Extremes and Outliers In the fourth part of the analysis, the data are first previewed, and then the thresholds for heatwaves and cold spells are defined. The heatwave threshold is taken as the 95th percentile of maximum temperature, and the cold-spell threshold as the 5th percentile of minimum temperature. The data are then filtered for heatwave days, with maximum temperature above the heatwave threshold, and for cold-spell days, with minimum temperature below the cold-spell threshold. The numbers of heatwave days and cold-spell days are printed, and the events are plotted. Similarly, a threshold for heavy rainfall is defined as the 95th percentile of rainfall; the number of heavy rainfall days is printed and the data are plotted. Box plots are used to identify outliers in the key weather variables. For the yearly summary of extremes, the data are grouped by year, the number of extreme events in each year is counted, and the summary is plotted. In this manner, the total number of heatwave days in the whole dataset is 1035, the number of cold-spell days is 1058, and the number of heavy rainfall days is 1018.
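The percentile thresholds and the yearly count of extremes can be sketched as follows, on synthetic data. The column names TX and TN are assumptions, and the rainfall percentile is taken here over all days, including dry days, which may or may not match the choice made in the study.

```python
import numpy as np
import pandas as pd

# Synthetic daily data (assumed columns: Date, TX, TN, RF).
rng = np.random.default_rng(4)
dates = pd.date_range("2000-01-01", "2004-12-31", freq="D")
n = len(dates)
df = pd.DataFrame({
    "Date": dates,
    "TX": rng.normal(31, 4, n),
    "TN": rng.normal(22, 5, n),
    "RF": rng.gamma(0.3, 10.0, n),
})

# Percentile-based thresholds.
heat_thr = df["TX"].quantile(0.95)
cold_thr = df["TN"].quantile(0.05)
rain_thr = df["RF"].quantile(0.95)

# One boolean flag per kind of extreme day.
df["heatwave"] = df["TX"] > heat_thr
df["cold_spell"] = df["TN"] < cold_thr
df["heavy_rain"] = df["RF"] > rain_thr

print("Heatwave days:", int(df["heatwave"].sum()))
print("Cold-spell days:", int(df["cold_spell"].sum()))
print("Heavy rainfall days:", int(df["heavy_rain"].sum()))

# Yearly summary: number of extreme days of each kind per year.
yearly = df.groupby(df["Date"].dt.year)[["heatwave", "cold_spell", "heavy_rain"]].sum()
print(yearly)
```

Because each threshold is a fixed percentile of the whole record, each flag marks roughly 5% of the days; the yearly table then shows how those days are distributed across years.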
5. Seasonal Analysis The fifth part of the analysis studies seasonal variations in temperature, precipitation, cloud cover, and sunshine. The means are calculated by season, and box plots are drawn for the seasonal averages of the key weather variables. To determine seasonal trends in sunshine duration SSH, the data are grouped by season and the trend is plotted. The distribution of precipitation across seasons is also examined. The seasons are the meteorological seasons defined earlier. Taking values above the 95th percentile as extreme precipitation, the extreme precipitation days are counted by season and plotted.
6. Visualization The sixth part of the analysis provides various visualizations of weather trends and distributions. Line plots of maximum, minimum, and mean temperature show the trend of these three parameters. Histograms are then plotted for the key weather variables: maximum, minimum, and mean temperature, rainfall, sunshine hours, and relative humidity. The first figure shows the daily temperature trend for maximum, minimum, and mean temperature. The other images show the distributions of rainfall, sunshine hours, and maximum and minimum temperature.
This slide continues from the previous slide. Box plots are used to visualise outliers, and a heat map of the correlation matrix shows the relationships between the key variables. The data are then grouped by month, the monthly means of maximum, minimum, and mean temperature are taken, and the monthly average rainfall is plotted. This slide shows the plots of the monthly trend of these parameters.
7. Forecasting The seventh part of the analysis focuses on forecasting future temperature and precipitation trends using the Prophet library. The Date column is converted to datetime format. Prophet requires a data frame with a column ds (the datetime) and a column y (the value to predict), so the Date column is renamed to ds and mean temperature TG to y. The Prophet model is then initialized and fitted to the data. A data frame for future predictions covering the next 365 days is created, and the forecast mean temperature for the next year is plotted. In the same manner, a future data frame for the next 365 days is created for precipitation with Prophet. The components of the temperature forecast and of the precipitation forecast are plotted, and the forecast data are saved to CSV files. Similarly, the forecast for the next 365 days is plotted for maximum temperature.
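The forecasting steps can be sketched as below. The helper names to_prophet_frame and forecast_next_year are hypothetical, the data are synthetic, and the model uses Prophet's default settings, which is an assumption since the slide does not state the configuration. The Prophet import sits inside the function so the data preparation runs even where the prophet package is not installed.

```python
import numpy as np
import pandas as pd


def to_prophet_frame(df: pd.DataFrame, date_col: str = "Date", value_col: str = "TG") -> pd.DataFrame:
    """Return the two-column frame Prophet expects: ds (datetime) and y (value)."""
    out = df[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])
    return out.dropna(subset=["y"]).reset_index(drop=True)


def forecast_next_year(frame: pd.DataFrame, periods: int = 365):
    """Fit Prophet with default settings and predict `periods` days ahead."""
    from prophet import Prophet  # requires the prophet package

    model = Prophet()
    model.fit(frame)
    future = model.make_future_dataframe(periods=periods)
    forecast = model.predict(future)
    return model, forecast


# Synthetic two-year record with dates stored as text, as in a raw CSV file.
rng = np.random.default_rng(5)
dates = pd.date_range("2020-01-01", periods=730, freq="D")
raw = pd.DataFrame({
    "Date": dates.strftime("%Y-%m-%d"),
    "TG": 26 + 5 * np.sin(2 * np.pi * dates.dayofyear.to_numpy() / 365.25) + rng.normal(0, 1, 730),
})
frame = to_prophet_frame(raw)

# With prophet installed, the remaining steps would be:
# model, forecast = forecast_next_year(frame)
# model.plot(forecast)
# model.plot_components(forecast)
# forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(365).to_csv("tg_forecast.csv", index=False)
```

The same two functions can be reused for precipitation or maximum temperature by passing a different value_col; the output file name above is only a placeholder.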
8. Comparing Climate Change Indicators The eighth part of the analysis compares climate change indicators. To understand the impact of climate change, the date is first set as the index, as done previously, and the data are divided into two periods: 1979 to 1999 as the early period and 2000 to 2025 as the recent period. The key variables are taken as the indicator parameters. The means of these indicators are calculated for the early and recent periods, and the comparison is plotted to visualise the change in the mean of these vital weather parameters. The data are then resampled annually, the means of the indicator parameters are calculated, and the annual trend of the climate indicators is plotted.
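The early-versus-recent comparison can be sketched as follows. The data are synthetic, with a small warming trend built in purely so that the comparison shows a difference; only the two period definitions come from the slide.

```python
import numpy as np
import pandas as pd

# Synthetic daily record indexed by date (assumed columns: TG, RF).
rng = np.random.default_rng(6)
dates = pd.date_range("1979-01-01", "2025-08-31", freq="D")
n = len(dates)
doy = dates.dayofyear.to_numpy()
df = pd.DataFrame(
    {
        "TG": 26
        + (0.02 / 365.25) * np.arange(n)          # artificial warming of 0.02 degrees per year
        + 5 * np.sin(2 * np.pi * (doy - 100) / 365.25)
        + rng.normal(0, 1, n),
        "RF": rng.gamma(0.3, 10.0, n),
    },
    index=dates,
)

# Split into the two periods by label slicing on the date index.
early = df.loc["1979":"1999"]
recent = df.loc["2000":"2025"]

# Mean of each indicator in each period, and the change between them.
comparison = pd.DataFrame({
    "early_1979_1999": early.mean(),
    "recent_2000_2025": recent.mean(),
})
comparison["change"] = comparison["recent_2000_2025"] - comparison["early_1979_1999"]
print(comparison.round(2))

# Annual means of the indicators ("YS" labels each year by its first day).
annual = df.resample("YS").mean()
```

A bar chart of comparison and a line plot of annual give the two kinds of figure described on this slide. Note that 2025 is a partial year in this record, so its annual mean covers January to August only.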
Next, the parameter 'range' is considered, defined as the difference between maximum and minimum temperature. It is resampled yearly and its mean is calculated to visualise the yearly mean temperature range. The annual trends of precipitation and sunshine are plotted from data resampled to yearly means, to understand the relationship between the annual trends of sunshine duration and precipitation. Anomalies are then compared for mean temperature TG and maximum temperature. The mean temperature over the early period described earlier, 1979 to 1999, is taken as the baseline mean temperature. The difference between the mean temperature TG and the baseline mean temperature is the anomaly, and the temperature anomaly is plotted. The maximum temperature anomaly is calculated in the same way and plotted.
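The temperature range and the baseline anomaly can be sketched as below, again on synthetic data with TX and TN as assumed column names. The baseline is the 1979 to 1999 mean, as described on this slide.

```python
import numpy as np
import pandas as pd

# Synthetic daily record indexed by date (assumed columns: TX, TN).
rng = np.random.default_rng(7)
dates = pd.date_range("1979-01-01", "2025-08-31", freq="D")
n = len(dates)
df = pd.DataFrame(
    {
        "TX": rng.normal(31, 3, n),
        "TN": rng.normal(22, 3, n),
    },
    index=dates,
)
df["TG"] = (df["TX"] + df["TN"]) / 2

# Temperature range and its yearly mean.
df["range"] = df["TX"] - df["TN"]
annual_range = df["range"].resample("YS").mean()

# Baseline: mean over the early period 1979 to 1999.
baseline_tg = df.loc["1979":"1999", "TG"].mean()
baseline_tx = df.loc["1979":"1999", "TX"].mean()

# Anomaly: departure of each value from the baseline mean.
df["TG_anomaly"] = df["TG"] - baseline_tg
df["TX_anomaly"] = df["TX"] - baseline_tx

# Yearly mean anomaly, convenient for plotting.
annual_anomaly = df[["TG_anomaly", "TX_anomaly"]].resample("YS").mean()
```

By definition the anomaly averages to zero over the baseline period, so positive yearly values after 1999 indicate years warmer than the 1979 to 1999 mean.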
9. Feature Engineering for Machine Learning Various derived parameters are obtained during feature engineering in the Python analysis: temperature range; rolling statistics features, namely the rolling mean and standard deviation over 7-day and 30-day windows; lag features for temperature and precipitation at 1, 3, and 7 days; cyclical and seasonal encodings to capture the cyclical and seasonal nature of the data; and new feature columns flagging extreme events such as heatwaves, cold spells, and heavy rainfall.
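The derived features listed above can be sketched as follows. All column names for the new features are hypothetical, the data are synthetic, and the sine/cosine form of the cyclical encoding and the 95th/5th percentile flags are assumptions chosen to match the earlier parts of the analysis.

```python
import numpy as np
import pandas as pd

# Synthetic daily data (assumed columns: Date, TX, TN, RF).
rng = np.random.default_rng(8)
dates = pd.date_range("2010-01-01", "2012-12-31", freq="D")
n = len(dates)
df = pd.DataFrame({
    "Date": dates,
    "TX": rng.normal(31, 3, n),
    "TN": rng.normal(22, 3, n),
    "RF": rng.gamma(0.3, 10.0, n),
})
df["TG"] = (df["TX"] + df["TN"]) / 2

# Temperature range.
df["temp_range"] = df["TX"] - df["TN"]

# Rolling mean and standard deviation over 7-day and 30-day windows.
for window in (7, 30):
    df[f"TG_roll_mean_{window}"] = df["TG"].rolling(window).mean()
    df[f"TG_roll_std_{window}"] = df["TG"].rolling(window).std()

# Lag features: the value 1, 3 and 7 days earlier.
for lag in (1, 3, 7):
    df[f"TG_lag_{lag}"] = df["TG"].shift(lag)
    df[f"RF_lag_{lag}"] = df["RF"].shift(lag)

# Cyclical encoding of day of year, so 31 December and 1 January end up close.
day_of_year = df["Date"].dt.dayofyear
df["doy_sin"] = np.sin(2 * np.pi * day_of_year / 365.25)
df["doy_cos"] = np.cos(2 * np.pi * day_of_year / 365.25)

# Extreme-event flags as 0/1 columns.
df["is_heatwave"] = (df["TX"] > df["TX"].quantile(0.95)).astype(int)
df["is_cold_spell"] = (df["TN"] < df["TN"].quantile(0.05)).astype(int)
df["is_heavy_rain"] = (df["RF"] > df["RF"].quantile(0.95)).astype(int)

# Save the derived file for later comparison (file name is a placeholder).
# df.to_csv("derived_weather_features.csv", index=False)
```

The first rows of the rolling and lag columns are empty because no earlier data exist for them. The rolling windows here include the current day, so a model that predicts the current day's value would need them shifted by one day to avoid using the target itself.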
Thanks For Watching.