ics Event Parser and Change Notifier

Felix Kohlhas
Tags: Scripts Python
A Python script that downloads multiple ics calendars, merges and deduplicates them, and sorts the events into mandatory and elective ones. It also generates a diff that summarizes changes.
The following description and code were created in collaboration with ChatGPT.

This Python script downloads iCalendar data from multiple URLs, categorizes the events as either mandatory or elective based on the event summary, and writes the data to separate iCalendar files. It then compares the events to a previous version, creates an HTML diff highlighting the differences, and saves the updated events to a file.

Here’s a breakdown of the script:

  1. Import the necessary modules: requests, icalendar, datetime, pytz, re, os, difflib, and html.

  2. Define the timezone and start date for the calendar.

  3. Define the file paths for the input URL list, mandatory and elective iCalendar files, events file, and diff file.

  4. Read in the URLs from the file urls.txt.

  5. Create new icalendar.Calendar() objects for the mandatory and elective events and a combined calendar for all events.

  6. Create a set to keep track of previously seen events, so that duplicates (such as bank holidays that appear in every calendar) can be removed.

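  7. Loop through each iCalendar URL and download its contents. For each downloaded calendar, loop through its events, skip those that start before the start date or have already been seen, and add the rest to the combined calendar and to either the mandatory or the elective calendar. Mandatory events are those whose summary ends with an asterisk; they also get a reminder 30 minutes before the start.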

  8. Sort the events in the mandatory, elective, and combined calendars by start time.

  9. Define a function format_events() to format the events in the iCalendar object as a string.

  10. Define a function save_calendar() to save the iCalendar object to a file.

  11. Save the mandatory and elective iCalendar objects to their respective files.

  12. Format all events as a string using the format_events() function.

  13. Check if the events file exists. If it does, read its contents, compare them to the new events, and create an HTML diff file if there are any differences.

  14. Save the updated events string to the events file.

The resulting files are hosted on a web server and can be subscribed to by URL. The script runs once a day via a cron job; if there are changes, the output of the script is sent by email.
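
The deduplication, categorization, and sorting steps (6 to 8 above) can be sketched without any network access or the icalendar library. This is only an illustration under stated assumptions: events are modelled as plain (start, end, summary) tuples, and the function name split_events and the sample data are hypothetical, not part of the script.

```python
from datetime import datetime

def split_events(calendars):
    """Deduplicate events across calendars and split them by the asterisk rule.

    `calendars` is a list of lists of (start, end, summary) tuples
    (a stand-in for parsed VEVENT components; hypothetical structure).
    """
    seen = set()
    mandatory, elective = [], []
    for events in calendars:
        for start, end, summary in events:
            key = (start, end, summary)
            if key in seen:
                continue  # same event already came from another calendar
            seen.add(key)
            if summary.endswith("*"):
                mandatory.append(key)
            else:
                elective.append(key)
    # Sort both lists by start time
    mandatory.sort(key=lambda event: event[0])
    elective.sort(key=lambda event: event[0])
    return mandatory, elective

# Made-up sample data: the holiday appears in both calendars
holiday = (datetime(2023, 5, 1, 0, 0), datetime(2023, 5, 2, 0, 0), "Holiday")
calendar_a = [(datetime(2023, 5, 1, 8, 0), datetime(2023, 5, 1, 9, 30), "Anatomy*"), holiday]
calendar_b = [
    (datetime(2023, 4, 20, 10, 0), datetime(2023, 4, 20, 11, 0), "Seminar"),
    holiday,
    (datetime(2023, 4, 18, 9, 0), datetime(2023, 4, 18, 10, 0), "Exam*"),
]

mandatory, elective = split_events([calendar_a, calendar_b])
print([event[2] for event in mandatory])
print([event[2] for event in elective])
```

Using the (start, end, summary) tuple as the key means that two events count as duplicates only if all three values match, which mirrors the check in the script.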

get_calendar.py

#!/usr/bin/python3

import requests
import icalendar
from datetime import datetime, timedelta
from pytz import timezone
import re

import os
import difflib
import html

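timezone = timezone('Europe/Berlin')
# pytz zones must be attached with localize(); passing tzinfo= directly
# would use the zone's local mean time offset instead of CET/CEST
start_date = timezone.localize(datetime(2023, 3, 5))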

file_urls = "urls.txt"
file_mandatory = "mandatory.ics"
file_elective = "elective.ics"
file_events = "events.txt"
file_diff = "diff.html"

with open(file_urls, "r") as f:
    # One URL per line; blank lines are ignored
    ical_links = [line.strip() for line in f if line.strip()]

# Create new calendar objects to hold the mandatory and elective events
mandatory_calendar = icalendar.Calendar()
elective_calendar = icalendar.Calendar()
combined_calendar = icalendar.Calendar()

# Create a set to keep track of previously seen events
seen_events = set()

# Loop through each iCalendar link and download its contents
for link in ical_links:
    # Use a timeout so that an unresponsive server cannot hang the nightly run
    response = requests.get(link, timeout=30)
    if response.status_code == 200:
        # Remove "RDATE:" markers and everything from "UID" to the end of the line,
        # then parse the cleaned data into an iCalendar object
        calendar_data = response.content.replace(b"RDATE:", b"")
        calendar_data = re.sub(b"UID.*",b"", calendar_data)
        parsed_calendar = icalendar.Calendar.from_ical(calendar_data)

        # Loop through each event in the iCalendar object and add it to the appropriate calendar
        for event in parsed_calendar.walk('VEVENT'):
            # Get summary
            summary = str(event.get('summary'))
            
            if event.get('dtstart').dt < start_date:
                continue
            
            # Skip if this event has already been seen
            event_key = (event.get('dtstart').dt, event.get('dtend').dt, summary)
            if event_key in seen_events:
                continue
            
            # Mark event as seen
            seen_events.add(event_key)
            
            combined_calendar.add_component(event)
            
            # Determine whether the event is mandatory or elective based on the summary
            if summary.endswith('*'):
                # Add a 30-minute alert to mandatory events
                alarm = icalendar.Alarm()
                alarm.add('trigger', timedelta(minutes=-30))
                alarm.add('action', 'DISPLAY')
                event.add_component(alarm)
                
                mandatory_calendar.add_component(event)
            else:
                elective_calendar.add_component(event)

# Sort the events of the mandatory, elective, and combined calendars by start time
mandatory_calendar.subcomponents = sorted(mandatory_calendar.subcomponents, key=lambda component: component.get('dtstart').dt)
elective_calendar.subcomponents = sorted(elective_calendar.subcomponents, key=lambda component: component.get('dtstart').dt)
combined_calendar.subcomponents = sorted(combined_calendar.subcomponents, key=lambda component: component.get('dtstart').dt)

def format_events(calendar):
    event_strings = []
    for event in calendar.walk('VEVENT'):
        # Get the start and end times in the correct timezone
        start_time = event.get('dtstart').dt.astimezone(timezone).strftime('%Y-%m-%d %H:%M:%S')
        end_time = event.get('dtend').dt.astimezone(timezone).strftime('%H:%M:%S')
        
        # Get the summary and location
        summary = str(event.get('summary'))
        location = str(event.get('location'))
        
        # Format the event string
        event_string = f"{start_time} - {end_time} {summary} ({location})"
        event_strings.append(event_string)
        
    return "\n".join(event_strings)

def save_calendar(calendar, file_path):
    # Write the calendar to a file
    with open(file_path, 'wb') as f:
        f.write(calendar.to_ical())
        
save_calendar(mandatory_calendar, file_mandatory)
save_calendar(elective_calendar, file_elective)

all_events_string = format_events(combined_calendar)

# Check if events.txt file exists
if os.path.isfile(file_events):
    # Read the contents of events.txt
    with open(file_events, 'r') as f:
        events_file_string = f.read()

    # Highlight the differences between events.txt and all_events_string
    if events_file_string != all_events_string:
        mtime = os.path.getmtime(file_events)
        mdate = datetime.fromtimestamp(mtime).strftime('%Y-%m-%d %H:%M:%S')
        today = datetime.today().strftime('%Y-%m-%d %H:%M:%S')
        
        diff = difflib.unified_diff(events_file_string.splitlines(), all_events_string.splitlines(), mdate, today, n=0)
        
        # Format the diff output as an HTML table with color highlighting
        diff_output = "<meta charset='UTF-8'><table style='font-family: monospace;'>"
        for line in diff:
            print(line)
            if line.startswith("+"):
                diff_output += f"<tr><td style='background-color:lightgreen'>{html.escape(line)}</td></tr>"
            elif line.startswith("-"):
                diff_output += f"<tr><td style='background-color:tomato'>{html.escape(line)}</td></tr>"
            else:
                diff_output += f"<tr><td>{html.escape(line)}</td></tr>"
        diff_output += "</table>"
        # Write as UTF-8 to match the charset declared in the HTML
        with open(file_diff, 'w', encoding='utf-8') as f:
            f.write(diff_output)

# Save all_events_string to events.txt
with open(file_events, 'w') as f:
    f.write(all_events_string)
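
The comparison of event.get('dtstart').dt with start_date in the script assumes that every event start is a timezone-aware datetime. All-day events are parsed as plain date objects and floating times as naive datetimes, and comparing either of these with an aware datetime raises a TypeError. The following is a minimal sketch of a normalization helper, not part of the script: the name as_aware_datetime is hypothetical, and a fixed UTC+1 offset stands in for the real zone to keep the example standard-library only.

```python
from datetime import date, datetime, time, timedelta, timezone

def as_aware_datetime(value, tz):
    """Return `value` as a timezone-aware datetime (hypothetical helper).

    - plain dates (all-day events) become midnight on that day
    - naive datetimes are interpreted as local time in `tz`
    - aware datetimes are returned unchanged
    """
    if not isinstance(value, datetime):
        value = datetime.combine(value, time.min)
    if value.tzinfo is None:
        # Assumption: `tz` is a datetime.timezone (fixed offset).
        # With a pytz zone, use tz.localize(value) here instead.
        value = value.replace(tzinfo=tz)
    return value

# Fixed UTC+1 offset used only for illustration
tz = timezone(timedelta(hours=1))
cutoff = datetime(2023, 3, 5, tzinfo=tz)

all_day = as_aware_datetime(date(2023, 3, 4), tz)
floating = as_aware_datetime(datetime(2023, 3, 6, 9, 0), tz)
utc_event = as_aware_datetime(datetime(2023, 3, 4, 23, 30, tzinfo=timezone.utc), tz)

print(all_day < cutoff, floating < cutoff, utc_event < cutoff)
```

With such a helper, the start-date filter and the sort keys could compare normalized values instead of the raw dt attributes.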

Cron job

0 7 * * * cd /srv/www/calendar/ && ./get_calendar.py

Example Output

--- 2023-05-09 08:00:03

+++ 2023-05-10 08:00:03

@@ -136 +135,0 @@

-2023-05-22 14:30:00 - 16:00:00 V Neuro Tumor ZNS IV (HS Kopf; INF 400)
@@ -156,0 +156 @@

+2023-06-13 08:00:00 - 09:30:00 V Neuro Tumor ZNS IV (HS Kopf; INF 400)
--- 2023-04-03 08:00:03

+++ 2023-04-04 08:00:03

@@ -65 +65 @@

-2023-04-17 10:45:00 - 12:15:00 VL Päd Fieber (00.264, EG; INF 440)
+2023-04-17 10:45:00 - 12:15:00 VL Päd Kommunikation (00.264, EG; INF 440)
@@ -68 +68 @@

-2023-04-18 11:15:00 - 12:45:00 VL Päd Gastro (00.264, EG; INF 440)
+2023-04-18 11:15:00 - 12:45:00 VL Päd Fieber (00.264, EG; INF 440)
@@ -70 +70 @@

-2023-04-20 10:15:00 - 11:45:00 VL Päd Kommunikation (00.264, EG; INF 440)
+2023-04-20 10:15:00 - 11:45:00 VL Päd Gastro (00.264, EG; INF 440)
@@ -75,0 +76 @@

+2023-04-25 12:00:00 - 13:30:00 VL Päd Neurgeborenes (00.264, EG; INF 440)
@@ -79 +79,0 @@

-2023-04-28 12:00:00 - 13:30:00 VL Päd Neurgeborenes (00.264, EG; INF 440)
@@ -86 +86 @@

-2023-05-05 10:15:00 - 11:45:00 VL Päd Infekt NG (00.264, EG; INF 440)
+2023-05-05 11:45:00 - 13:15:00 VL Päd Infekt NG (00.264, EG; INF 440)
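
The blank lines after the "---", "+++" and "@@" lines in the output above come from difflib itself: by default, unified_diff() ends its control lines with a newline (lineterm="\n"), and print() then adds a second one. The sketch below shows the difference when lineterm="" is passed; the labels "old" and "new" are placeholders, and the two event lines are taken from the example output.

```python
import difflib

old = ["2023-04-17 10:45:00 - 12:15:00 VL Päd Fieber (00.264, EG; INF 440)"]
new = ["2023-04-17 10:45:00 - 12:15:00 VL Päd Kommunikation (00.264, EG; INF 440)"]

# Default: control lines ("---", "+++", "@@") carry a trailing newline
default_lines = list(difflib.unified_diff(old, new, "old", "new", n=0))

# lineterm="": no trailing newlines, suitable for inputs from splitlines()
clean_lines = list(difflib.unified_diff(old, new, "old", "new", n=0, lineterm=""))

for line in clean_lines:
    print(line)
```

Whether to change this in the script is a matter of taste; the HTML diff is not affected visibly, only the printed output.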

Summary

For a more human-readable version, ChatGPT can be asked to summarize the diff, which yields the following:

The following changes were made to the calendar:

  • On April 17, the event “VL Päd Fieber” was changed to “VL Päd Kommunikation” from 10:45 AM to 12:15 PM.
  • On April 18, the event “VL Päd Gastro” was changed to “VL Päd Fieber” from 11:15 AM to 12:45 PM.
  • On April 20, the event “VL Päd Kommunikation” was changed to “VL Päd Gastro” from 10:15 AM to 11:45 AM.
  • On April 25, a new event “VL Päd Neurgeborenes” was added from 12:00 PM to 1:30 PM.
  • On April 28, the event “VL Päd Neurgeborenes” was removed.
  • On May 5, the event “VL Päd Infekt NG” was changed from 10:15 AM to 11:45 AM to 11:45 AM to 1:15 PM.
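
As a rough alternative to asking ChatGPT, the diff lines can also be grouped mechanically. The sketch below is an assumption-laden illustration, not part of the script: the function name summarize_diff is hypothetical, and it relies on the line format produced by format_events() ("start - end summary (location)"). An event label that appears in both removed and added lines is reported as moved.

```python
import re

# A "+" or "-" followed by the line format written by format_events()
EVENT_LINE = re.compile(
    r"^([+-])(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - (\d{2}:\d{2}:\d{2}) (.*)$"
)

def summarize_diff(diff_lines):
    """Group unified-diff lines into moved, added, and removed events."""
    removed, added = {}, {}
    for line in diff_lines:
        match = EVENT_LINE.match(line.rstrip("\n"))
        if not match:
            continue  # "---", "+++", "@@" headers and blank lines
        sign, start, end, label = match.groups()
        target = added if sign == "+" else removed
        target.setdefault(label, []).append(f"{start} - {end}")

    summary = []
    for label in sorted(set(removed) | set(added)):
        old, new = removed.get(label), added.get(label)
        if old and new:
            summary.append(f"moved: {label}: {', '.join(old)} -> {', '.join(new)}")
        elif new:
            summary.append(f"added: {label}: {', '.join(new)}")
        else:
            summary.append(f"removed: {label}: {', '.join(old)}")
    return summary

# First diff from the example output
diff_lines = [
    "--- 2023-05-09 08:00:03",
    "+++ 2023-05-10 08:00:03",
    "@@ -136 +135,0 @@",
    "-2023-05-22 14:30:00 - 16:00:00 V Neuro Tumor ZNS IV (HS Kopf; INF 400)",
    "@@ -156,0 +156 @@",
    "+2023-06-13 08:00:00 - 09:30:00 V Neuro Tumor ZNS IV (HS Kopf; INF 400)",
]

result = summarize_diff(diff_lines)
for entry in result:
    print(entry)
```

Matching on the full label is a simplification: an event whose room or title changes at the same time as its slot would show up as one removal and one addition rather than as a move.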