Week 51 Changelog
Added
- Added new scripts for content and community workflows:
  - scripts/weekly_changelog_llm.py for generating weekly changelogs using an LLM
  - scripts/gen_newsletter.py for newsletter generation
  - scripts/generate_social_posts.py for creating social media posts
- Added new utility code in the comment crawler, including book_link_utils.py and an updated implementation of check_recent_book_links.py.
- Added a new version of the bulk indexer script as scripts/bulk_index_new.py.new for future indexing improvements.
- Added scripts/latest_social_posts.json to support social post generation.
Changed
- Moved existing maintenance scripts (backfill_authors_books.py and bulk_index_new.py) into a dedicated scripts/ directory to improve project organization.
- Updated crawler-related files, including hnb-comment-crawler/hn_crawler.py and the crawler crontab configuration (hnb-book-crawler/crontab.crawler).
- Tweaked .gitignore to reflect the current project structure and remove obsolete entries.
Fixed
- Improved comment crawler behavior by updating hn_crawler.py and replacing the old check_recent_book_links.py with a cleaner implementation.
- Adjusted crawler scheduling and automation via changes to hnb-book-crawler/crontab.crawler so recent book links stay fresh.
Infra/CI
- Updated cron jobs and Docker-related automation for the book crawler (hnb-book-crawler/crontab.crawler and related files).
- Removed legacy deployment and maintenance scripts (various old .py and .sh utilities) to streamline infrastructure.
- Cleaned up large, outdated data dumps (all_books.txt, all_urls_before_update.txt, have_ids.txt, test_books.txt) and other obsolete files (link_update_log.jsonl, etc.) to reduce clutter and keep the repo lean.
Next up: Fix Bluesky logo