Since the announcement of Pocket’s shutdown I’ve been looking for a replacement and I’ve settled on linkding.
After setting it up on my home server I went over to Pocket’s website to export my saved data so that I could migrate all of it.
The exported data consists of a .csv file containing all my saved links, and now I needed to convert it to a bookmarks.html file since that’s what linkding uses as its import format.
My first instinct was to find some online tool that would take care of the conversion, and the one I found that looked suitable was csv-to-bookmarks.glitch.me.
It did convert the .csv just fine but then I noticed something: linkding will read the tags field as a comma-separated list of strings, but the Pocket export has them separated by |.
I spent some time wrangling the files, but it was not exactly reproducible, and since the migration process might be useful for other people I decided to write a little Python script that takes care of that:
import csv
import html
import sys

csv_filename = sys.argv[1]

html_output = '<!DOCTYPE NETSCAPE-Bookmark-file-1>\n'
html_output += '<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">\n'
html_output += '<TITLE>Bookmarks</TITLE>\n'
html_output += '<H1>Bookmarks</H1>\n'
html_output += f'<H3>{html.escape(csv_filename)}</H3>\n'
html_output += '<DL><p>\n'

# newline='' lets the csv module handle quoted fields that contain line breaks
with open(csv_filename, 'r', newline='', encoding='utf-8') as csv_data:
    csv_reader = csv.reader(csv_data, strict=True)
    next(csv_reader, None)  # skip the header
    for row in csv_reader:
        # escape values so that &, < and " in titles or URLs don't break the HTML
        title = html.escape(row[0].strip())
        url = html.escape(row[1])
        added_date = row[2]
        tags = html.escape(row[3].replace('|', ','))  # Pocket uses |, linkding wants commas
        html_output += f'<DT><A HREF="{url}" ADD_DATE="{added_date}" TAGS="{tags}">{title}</A>\n'

html_output += '</DL><p>'
print(html_output)
Save it to convert.py and invoke it like so:
python convert.py pocket_export.csv > bookmarks.html
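Before importing, it might be worth a quick sanity check of the generated file. The following is only a rough sketch using Python's standard html.parser module: it counts the bookmark entries and reports how many still carry a | in their tags. The names BookmarkCounter and summarize are made up for this example and are not part of linkding or of the script above.

```python
from html.parser import HTMLParser


class BookmarkCounter(HTMLParser):
    """Hypothetical helper: collects (href, tags) for every <A> element."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names in lowercase
        if tag == 'a':
            attributes = dict(attrs)
            self.bookmarks.append((attributes.get('href'), attributes.get('tags') or ''))


def summarize(bookmarks_html):
    """Return (number of bookmarks, number of bookmarks whose tags still contain '|')."""
    parser = BookmarkCounter()
    parser.feed(bookmarks_html)
    parser.close()
    leftover = sum(1 for _, tags in parser.bookmarks if '|' in tags)
    return len(parser.bookmarks), leftover
```

Calling summarize(open('bookmarks.html', encoding='utf-8').read()) should give a bookmark count matching the number of rows in the Pocket export, with a second value of 0 if every tag separator was converted.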
On digital hoarding and link rot
I’ve been a prolific Pocket user for the past 13 years (this is the first page I ever collected) with 3688 saved links and a nice chunk of those still to be read.
Hoarders gonna hoard I guess.
I decided to run lychee over the bookmarks file so I could see how far the link rot had spread.
lychee --threads 1 --timeout 10 --accept 200..=206,403 bookmarks.html
📝 Summary
---------------------
🔍 Total.........3688
✅ Successful....3282
⏳ Timeouts........42
🔀 Redirected.......0
👻 Excluded.........3
❓ Unknown..........8
🚫 Errors.........353
89% of URLs still active is pretty good!
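For the curious, the 89% figure is just the successful count divided by the total. A tiny sketch of the arithmetic, with the numbers copied from the lychee summary (the dictionary keys are my own labels, not lychee output fields):

```python
# Counts copied from the lychee summary; the key names are illustrative only
summary = {
    'successful': 3282,
    'timeouts': 42,
    'redirected': 0,
    'excluded': 3,
    'unknown': 8,
    'errors': 353,
}

total = sum(summary.values())  # the categories add up to the reported total
active_share = summary['successful'] / total

print(total, f'{active_share:.0%}')  # prints: 3688 89%
```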
I was initially disheartened when the failed count was much higher, but that was most likely due to lychee being blocked, so what you see above is the report after adjusting for that.