This week I focused mainly on researching viable ML models for renewable energy forecasting and on finding data sources. I found multiple papers and surveys comparing methods for renewable forecasting, and based on them I narrowed the algorithms I'll start experimenting with to a shortlist that includes SVM, RNN, and ARIMA. For data, I found global weather data for day-ahead forecasting and model training, as well as solar radiation datasets from the National Solar Radiation Database for model validation. Wind turbine generation data covering multiple locations proved harder to find, but I did find some Kaggle datasets for turbines in Europe that could be used for validation. These sources are linked at the bottom of this post.
My progress is on schedule so far. Next week I plan to set up a remote repository for the renewable forecasting work and develop a detailed outline of our ML pipeline, including plans for data scraping, preprocessing, training, validation, and testing. I intend to focus first on SVMs as the front-runner among the algorithms I found, and I'll look to make a visual representation of our full ML pipeline.
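As a rough illustration of the training, validation, and testing stages mentioned above, here is a minimal sketch of a chronological split, assuming the data arrives as a time-ordered series. The function name, the 70/15/15 fractions, and the sample values are all hypothetical placeholders, not decisions made for this project.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    samples: Sequence[float],
    train_frac: float = 0.7,
    val_frac: float = 0.15,
) -> Tuple[List[float], List[float], List[float]]:
    """Split a time-ordered series into train/validation/test sets.

    The data is never shuffled, so every validation and test sample
    comes later in time than every training sample.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty test share")
    n = len(samples)
    train_end = round(n * train_frac)
    val_end = round(n * (train_frac + val_frac))
    return (
        list(samples[:train_end]),
        list(samples[train_end:val_end]),
        list(samples[val_end:]),
    )


# Hypothetical hourly generation values standing in for real data.
hourly_output = [float(i) for i in range(100)]
train, val, test = chronological_split(hourly_output)
```

Keeping the split in time order matters for forecasting: a shuffled split would let the model train on samples that come after the ones it is evaluated on.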
Weather API: https://openweathermap.org/api
NSRDB Solar Datasets: https://nsrdb.nrel.gov/data-viewer
Kaggle Wind Dataset: https://www.kaggle.com/datasets/berkerisen/wind-turbine-scada-dataset