bendun.cc

Proper Timestamps

Written , a 3 minute read

The most common form of timestamp on the web is the datestamp, where you specify only the year, month and day of an event. It's so common that I replicated it on my website without thinking. Perhaps this was a mistake.

The problem with dates is the same as with time: timezones ruin everything. If my timezone and yours don't line up, you may see my post as written in the future, or not see it at all if your reader throws away entries with invalid dates (dates that haven't happened yet according to local time). For more reasons, check out Lost in Time by Chris Burnell, the article that inspired me to improve dates on my site. There is one problem though.
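To make the "written in the future" case concrete, here is a minimal sketch. The post time and the reader's zone are made up for illustration, and it assumes the reader simply compares the bare datestamp with its own local date.

```python
from datetime import date, datetime
from zoneinfo import ZoneInfo

def local_date(moment: datetime, zone: str) -> date:
    """Calendar date that `moment` falls on for a reader in `zone`."""
    return moment.astimezone(ZoneInfo(zone)).date()

# Hypothetical post: finished half past midnight in Warsaw, stamped with the date only.
written = datetime(2024, 9, 5, 0, 30, tzinfo=ZoneInfo("Europe/Warsaw"))
datestamp = written.date()

# A reader in Los Angeles opening the feed at that very moment.
reader_today = local_date(written, "America/Los_Angeles")
looks_like_future = datestamp > reader_today

print(datestamp, reader_today, looks_like_future)
```

For that reader it is still the afternoon of the 4th, so a post dated the 5th looks like it comes from tomorrow.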

Time of publication doesn't matter

My publication times are public, since the code for this site is on GitHub. However, they don't match the dates written next to each post: those dates describe when a post was created, and a post can be committed and made public any time after it was written. With this in mind, when is a post finished? By the time I write the last sentence it may be well after midnight, yet the post feels like it belongs to the previous day, not the new one. It wasn't written in the new day's mood or circumstances, so it doesn't belong to the new day.

While the problem above is mostly overthinking, a more tangible one is how to mark older posts that don't have precise timestamps. I could use commit times, but sadly they don't always match the declared dates:

Declared    Committed
2024-09-05  2024-09-05
2024-08-31  2024-08-31
2024-08-26  2024-08-28
2024-08-18  2024-08-18
2024-06-19  2024-06-19
2024-06-12  2024-06-12
2024-06-06  2024-06-06
2024-05-31  2024-05-31
2024-05-21  2024-05-21
2024-05-18  2024-05-18
2024-05-06  2024-05-06
2024-04-17  2024-04-17
2024-04-17  2024-04-17
2024-04-14  2024-04-15
2024-03-24  2024-03-24
2024-03-09  2024-03-10
2024-03-03  2024-03-03
2024-02-26  2024-02-29
2024-02-25  2024-02-29

So for the posts where the two dates differ, I will assume 23:59 in my local time; the rest can keep their commit time.
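A minimal sketch of that rule, not the exact script used for the migration. The function name `resolve_timestamp` and the commit hour in the example are made up, and converting the commit time with `astimezone` is an assumption of this sketch.

```python
from datetime import date, datetime, time
from zoneinfo import ZoneInfo

LOCAL_TZ = ZoneInfo("Europe/Warsaw")

def resolve_timestamp(declared: date, committed: datetime) -> datetime:
    """Pick a full timestamp for a post that only declared a date."""
    committed_local = committed.astimezone(LOCAL_TZ)
    if committed_local.date() == declared:
        # Dates agree: the commit time is a good enough stand-in.
        return committed_local
    # Dates disagree: keep the declared day and pin it to its last minute.
    return datetime.combine(declared, time(23, 59), tzinfo=LOCAL_TZ)

# Declared 2024-08-26 but committed two days later (the hour is hypothetical).
mismatch = resolve_timestamp(
    date(2024, 8, 26), datetime(2024, 8, 28, 10, 0, tzinfo=LOCAL_TZ))
print(mismatch.isoformat())  # 2024-08-26T23:59:00+02:00
```

Pinning to 23:59 keeps the declared day intact while still sorting the post after anything else written that day.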

With some terrible Python scripting, I updated all of the dates. The ones in the archive index got updated too, since the index now reads each date from the source of the post itself (using more terrible Python code):

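import functools
import re
from datetime import datetime

class Post:
  @property
  @functools.cache
  def date(self) -> datetime:
    # Read the timestamp from the post's own <time datetime="..."> element.
    with open(self.path, 'r') as file:
      html = file.read()
    return datetime.fromisoformat(
      re.search(r'<time datetime="([0-9:T+-]+)">[0-9-]+<\/time>', html).group(1))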

Addendum: terrible Python scripting

I've written my archive list generator in Python, which makes it quite easy to reuse its list of posts for some quick data gathering or manipulation. All I need to do is import index to get the list of all my published posts.

import datetime
import pathlib
import shlex
import subprocess
import zoneinfo

import index

def git_creation_date(path: str) -> str:
  # Author date (strict ISO 8601) of the commit that added the file.
  cmd = shlex.split('git log --diff-filter=A --follow --format=%aI -1 --')
  return subprocess.check_output((*cmd, path), text=True).strip()

def main():
  tz = zoneinfo.ZoneInfo('Europe/Warsaw')

  for post in index.POSTS:
    git = git_creation_date(post.path)
    declared = post.date.strftime('%Y-%m-%d')
    commited = git.split('T')[0]

    post.date = post.date.replace(hour=23, minute=59, tzinfo=tz) \
      if declared != commited \
      else datetime.datetime.fromisoformat(git).replace(tzinfo=tz)

    assert post.date.strftime('%Y-%m-%d') == declared

    with open(post.path, 'r') as f:
      html = f.read()

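    # Assumption: the old markup was a bare <time> element holding only the date.
    search = f'<time>{declared}</time>'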
    assert html.find(search) >= 0, f"post {post.path}"
    html = html.replace(search,
      f'<time datetime="{post.date.isoformat()}">{declared}</time>', 1)

    with open(post.path, 'w') as f:
      f.write(html)

if __name__ == "__main__":
  main()
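One detail of the script worth illustrating: `replace(tzinfo=tz)` keeps the wall-clock digits and only swaps the zone label, while `astimezone(tz)` converts the instant. For commits authored in the same zone the two agree. The sketch below uses a made-up commit authored in UTC to show where they differ.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

tz = ZoneInfo("Europe/Warsaw")

# Hypothetical author date in the format produced by git's %aI, authored in UTC.
git = "2024-03-09T23:30:00+00:00"
authored = datetime.fromisoformat(git)

relabelled = authored.replace(tzinfo=tz)  # same digits, new zone label
converted = authored.astimezone(tz)       # same instant, Warsaw wall clock

print(relabelled.isoformat())  # 2024-03-09T23:30:00+01:00
print(converted.isoformat())   # 2024-03-10T00:30:00+01:00
```

With `replace` the calendar date always equals the one in the git string, which is what the assert in the script relies on.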

I love that in Python you often create both the best and the worst code ever seen.