Algroveon-Mini-SSG – Technical Architecture of the Generator

Introduction

The developer diary article told the story behind Algroveon-Mini-SSG. This article is not about how the system came to be, but about its technical foundation: How is the system structured? Why was the template system designed this way? How does the translation engine work in detail? And why is the Admin UI server-side rendered instead of built as a React app?

Architecture Overview

sites/<site>/
  site.yaml          ← Configuration, navigation, theme assignment
  content/<lang>/    ← Markdown content, sorted by language
  static/            ← CSS, images, fonts
  templates/         ← Site-specific template overrides (optional)
  _site/             ← Build output (pure HTML)

themes/<theme>/      ← Shared Jinja2 templates (cross-site)

build.py             ← Build engine
admin.py             ← Flask-Admin server
translate.py         ← LLM translation engine

The separation between themes/ and sites/ is intentional. A theme can be shared by multiple sites. Site-specific customizations live in sites/<site>/templates/ and override the theme without touching the original. The Jinja2 ChoiceLoader resolves this: it searches the site-specific folder first, then the theme.
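The lookup order can be illustrated without Jinja2 itself. The following is a minimal sketch in plain Python that mimics what a ChoiceLoader with two filesystem loaders does. The function name resolve_template and the site and theme names are hypothetical and not taken from build.py.

```python
from pathlib import Path
import tempfile


def resolve_template(name: str, search_dirs: list[Path]) -> Path:
    """Return the first existing template file, searching directories in order."""
    for directory in search_dirs:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(name)


def demo() -> tuple[str, str]:
    """Build a throwaway project tree and resolve two templates in it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        site_dir = root / "sites" / "example" / "templates"  # hypothetical site
        theme_dir = root / "themes" / "default"              # hypothetical theme
        site_dir.mkdir(parents=True)
        theme_dir.mkdir(parents=True)
        (theme_dir / "post.html").write_text("theme post")
        (theme_dir / "page.html").write_text("theme page")
        (site_dir / "post.html").write_text("site post")     # site-specific override

        order = [site_dir, theme_dir]                         # site first, then theme
        return (
            resolve_template("post.html", order).read_text(),
            resolve_template("page.html", order).read_text(),
        )
```

Here post.html resolves to the site override, while page.html falls through to the theme because the site does not provide its own copy.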

The Template System

Layouts via Frontmatter

Every Markdown file has a layout: field in its frontmatter:

---
layout: post
title: "My Article"
---

build.py looks up the matching template (post.html) via the ChoiceLoader. A site can therefore use any number of layouts (post, project, page, home, blog-index, 404, and more), each a plain Jinja2 HTML file. The mechanism is deliberately simple, and that simplicity is what makes it flexible.
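How a layout field might be turned into a template filename can be sketched as follows. This is an illustration only: split_frontmatter and template_name are hypothetical names, the parser handles just flat key: value lines (a real build would presumably use a YAML parser), and the fallback layout "page" is an assumption.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a document into a flat frontmatter dict and the Markdown body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the body.
            return meta, "\n".join(lines[index + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    return {}, text  # no closing delimiter: treat everything as body


def template_name(meta: dict[str, str], default: str = "page") -> str:
    """Map the layout field to a template filename, e.g. post -> post.html."""
    return f"{meta.get('layout', default)}.html"


SAMPLE = '---\nlayout: post\ntitle: "My Article"\n---\n\nHello *world*.\n'
```

Calling split_frontmatter(SAMPLE) yields the two frontmatter fields and the body; template_name then gives the filename that the template loader is asked for.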

Template Context

Each template receives two objects:

page – the rendered document: page.title, page.content (HTML), page.date, page.stack, and all frontmatter fields.
site – the site configuration: site.title, site.nav, site.url, site.author.

Additionally, all_pages (all built pages as a list) and collections (pages grouped by tags) are available. This is particularly useful for overview pages, archive pages, or sitemaps.
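A rough sketch of how collections could be derived from all_pages, assuming each page is a dict with an optional tags list and an ISO date string. Both field names and the newest-first ordering are assumptions for illustration, not confirmed details of build.py.

```python
from collections import defaultdict


def build_collections(all_pages: list[dict]) -> dict[str, list[dict]]:
    """Group pages by tag; a page with several tags appears in several groups."""
    collections: dict[str, list[dict]] = defaultdict(list)
    for page in all_pages:
        for tag in page.get("tags", []):
            collections[tag].append(page)
    # Newest first within each collection; ISO date strings sort correctly as text.
    for pages in collections.values():
        pages.sort(key=lambda p: p.get("date", ""), reverse=True)
    return dict(collections)
```

A template for an archive page could then iterate over one collection, while a sitemap template would iterate over all_pages directly.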

Site-Specific Overrides via the Admin

In the Design tab of the Admin UI, templates can be edited directly in the browser. When saving, the Admin UI writes the modified file to sites/<site>/templates/<template>.html. During the next build, this file takes precedence over the theme original. There is no special magic behind this, just the ChoiceLoader searching local site paths first.

Multilingualism: URL Structure and Fallback

URL Schema

URLs follow the schema /<lang>/path.html, for example /de/blog/mein-artikel.html and /en/blog/my-article.html. This is not merely a convention, but a conscious architectural decision: both language versions exist as independent HTML files. There is no client-side language switching and no JavaScript routing. This keeps the output easy to inspect, easy to trace, and robust.
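The mapping from a content file to its URL can be sketched in a few lines. The function output_url is hypothetical, and the sketch assumes that the path below content/<lang>/ is carried over unchanged apart from the file extension.

```python
from pathlib import PurePosixPath


def output_url(content_path: str) -> str:
    """Map content/<lang>/<path>.md to /<lang>/<path>.html."""
    parts = PurePosixPath(content_path).parts
    if len(parts) < 3 or parts[0] != "content":
        raise ValueError(f"unexpected content path: {content_path}")
    lang = parts[1]
    rest = PurePosixPath(*parts[2:]).with_suffix(".html")
    return "/" + str(PurePosixPath(lang) / rest)
```

For example, content/de/blog/mein-artikel.md maps to /de/blog/mein-artikel.html.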

Fallback Logic

If a translation is missing, the build falls back to the source-language file instead of generating a 404 page. This is controlled via the source_hash field in the frontmatter, which the translation process writes.

Date Parsing

A date prefix in the filename, as in 2024-11-17-article.md, is recognized automatically. A date field in the frontmatter takes precedence over it. The parse_date utility supports, among others:

ISO: 2024-11-17
European: 17.11.2024, 17-11-2024
British/American: 17/11/2024, 11/17/2024
Long text DE: 17. November 2024
Long text EN: November 17, 2024 / 17 November 2024

Month names are loaded from locales/*.yaml. This makes adding new languages relatively easy: add a locale file, and you're done.
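A minimal sketch of such a parser, covering the formats listed above. It is not the actual parse_date: the month names are hard-coded here as a stand-in for the locale files, date_from_filename is a hypothetical helper, and the choice to try day-first before month-first for slash dates is an assumption (a date like 03/04/2024 is ambiguous either way).

```python
from datetime import date, datetime
import re
from typing import Optional

EN = ["january", "february", "march", "april", "may", "june", "july",
      "august", "september", "october", "november", "december"]
DE = ["januar", "februar", "märz", "april", "mai", "juni", "juli",
      "august", "september", "oktober", "november", "dezember"]
# Stand-in for the month names loaded from locales/*.yaml.
MONTHS = {name: i for names in (EN, DE) for i, name in enumerate(names, start=1)}

# Tried in order; day-first wins over month-first for ambiguous slash dates.
NUMERIC_FORMATS = ["%Y-%m-%d", "%d.%m.%Y", "%d-%m-%Y", "%d/%m/%Y", "%m/%d/%Y"]


def parse_date(text: str) -> date:
    """Parse numeric and long-text dates; raise ValueError if nothing matches."""
    text = text.strip()
    for fmt in NUMERIC_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    # "17. November 2024" or "17 November 2024"
    match = re.fullmatch(r"(\d{1,2})\.?\s+(\w+)\s+(\d{4})", text)
    if match:
        day, month, year = match.groups()
    else:
        # "November 17, 2024"
        match = re.fullmatch(r"(\w+)\s+(\d{1,2}),?\s+(\d{4})", text)
        if not match:
            raise ValueError(f"unrecognized date: {text!r}")
        month, day, year = match.groups()
    number = MONTHS.get(month.lower())
    if number is None:
        raise ValueError(f"unknown month name: {month!r}")
    return date(int(year), number, int(day))


def date_from_filename(filename: str) -> Optional[date]:
    """Return the date from a YYYY-MM-DD- filename prefix, or None."""
    match = re.match(r"(\d{4}-\d{2}-\d{2})-", filename)
    return parse_date(match.group(1)) if match else None
```

A caller would use the frontmatter date when present and fall back to date_from_filename otherwise.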

The Translation Engine

Provider Detection

translate.py supports three providers, which are automatically detected based on the configured API URL:

Ollama (local) – Port 11434 or no URL → native Ollama API (/api/chat, /api/generate)
OpenAI – openai.com in the URL → openai package, direct API
OpenAI-compatible – everything else → openai package with its own base_url (for example Mistral, LM Studio, or local endpoints)
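The three detection rules can be written down as a small function. This is a sketch under assumptions: detect_provider and the returned labels are hypothetical names, and the URL is expected to include a scheme such as http://.

```python
from typing import Optional
from urllib.parse import urlparse


def detect_provider(api_url: Optional[str]) -> str:
    """Classify the configured API URL into one of three provider kinds."""
    if not api_url:
        return "ollama"                    # no URL: local Ollama
    if urlparse(api_url).port == 11434:
        return "ollama"                    # Ollama's default port
    if "openai.com" in api_url.lower():
        return "openai"
    return "openai-compatible"             # e.g. Mistral, LM Studio, local endpoints
```

The order of the checks matters: the Ollama rules are evaluated first, and everything that matches neither Ollama nor OpenAI is treated as OpenAI-compatible.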

Configuration is stored in configuration.yaml (versioned) and configuration.local.yaml (gitignored, for API keys and internal URLs). The local file overrides the base configuration.
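The override behavior can be sketched as a recursive merge of two already-loaded dictionaries. Whether the real implementation merges nested sections key by key or replaces them wholesale is not stated above; the deep merge below, the function name merge_config, and the example keys are assumptions.

```python
def merge_config(base: dict, local: dict) -> dict:
    """Return base overlaid with local; nested dicts are merged key by key."""
    merged = dict(base)
    for key, value in local.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value            # local value wins
    return merged
```

With this approach the versioned file can hold all defaults, and the gitignored file only needs the few keys that differ, such as an API key.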

Block-wise Translation

The content is not sent to the model as a whole. Instead, the Markdown body is split into blocks and translated block by block. This is easier to control and has several advantages:

the context remains manageable, which usually leads to more stable results for longer articles
code blocks (enclosed in ```) are skipped and not translated
frontmatter fields (title, description, and all fields defined in translate_fields) are processed separately
KEEP_FIELDS (for example layout, status, stack, date) remain unchanged
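The splitting step described above can be sketched as follows. The assumption here is that blocks are separated by blank lines and that fenced code blocks are kept whole; the names split_blocks and translate_body are hypothetical, and translate stands for whatever function calls the model.

```python
from typing import Callable

FENCE = "`" * 3  # Markdown code fence marker


def split_blocks(body: str) -> list[tuple[str, bool]]:
    """Split Markdown into (text, translatable) blocks at blank lines.

    Fenced code blocks are kept whole and marked as not translatable.
    """
    blocks: list[tuple[str, bool]] = []
    current: list[str] = []
    in_code = False

    def flush(translatable: bool) -> None:
        if current:
            blocks.append(("\n".join(current), translatable))
            current.clear()

    for line in body.splitlines():
        if line.lstrip().startswith(FENCE):
            if in_code:
                current.append(line)
                flush(False)               # closing fence ends the code block
            else:
                flush(True)                # prose directly before the fence
                current.append(line)
            in_code = not in_code
        elif in_code:
            current.append(line)           # blank lines inside code are kept
        elif line.strip() == "":
            flush(True)
        else:
            current.append(line)
    flush(not in_code)                     # an unclosed fence stays untranslated
    return blocks


def translate_body(body: str, translate: Callable[[str], str]) -> str:
    """Translate prose blocks one by one and pass code blocks through unchanged."""
    return "\n\n".join(
        translate(text) if translatable else text
        for text, translatable in split_blocks(body)
    )
```

Passing str.upper as translate is a quick way to see which parts of a document would reach the model and which would not.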

Obsolescence Detection via SHA-256

Every translated file gets a source_hash field entered into its frontmatter:

source_hash: a3f7c12e4b8d9e01

This is a SHA-256 hash of the source file body plus all frontmatter fields that are translated, truncated to the first 16 hex characters. If the source file changes, translate.py recognizes the translation as outdated during the next run and re-translates only that file. Files with an unchanged hash are skipped, so a run costs LLM calls only for content that actually changed.
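A minimal sketch of such a hash. How exactly the body and the fields are combined before hashing is not specified above, so the serialization used here (body first, then sorted key=value lines) is an assumption, as are the function names.

```python
import hashlib
from typing import Optional


def source_hash(body: str, fields: dict[str, str]) -> str:
    """SHA-256 over the body plus translatable fields, first 16 hex characters."""
    digest = hashlib.sha256()
    digest.update(body.encode("utf-8"))
    for key in sorted(fields):             # stable order, independent of file order
        digest.update(f"\n{key}={fields[key]}".encode("utf-8"))
    return digest.hexdigest()[:16]


def is_outdated(stored_hash: Optional[str], body: str, fields: dict[str, str]) -> bool:
    """True if the translation was made from a different version of the source."""
    return stored_hash != source_hash(body, fields)
```

Sixteen hex characters correspond to 64 bits of the digest, which is ample for detecting changes within a single site.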

The --rehash flag writes the hash into existing files without triggering an LLM call. This is particularly useful after manual corrections when a translation should be explicitly marked as current.

No-Thinking Mode

Reasoning-capable models like Qwen3, QwQ, or DeepSeek-R1 often generate a <think> block before the actual response. For pure translation tasks this adds little value but can significantly extend the runtime. The --no-thinking flag therefore sends a system instruction that suppresses this behavior. Online APIs that cannot process this simply ignore the option.

VRAM Management with Ollama

If multiple models are loaded on a local Ollama instance, translate.py requests a VRAM cleanup before the first translation call: all models except the configured translation model are unloaded (keep_alive: 0). This is especially helpful on systems with limited VRAM to avoid out-of-memory errors.
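The selection logic can be sketched as a pure function that builds the request bodies. Ollama documents a generate request with keep_alive set to 0 as the way to unload a model. How translate.py obtains the list of loaded models and sends the requests is not shown here, and the name unload_payloads as well as the model names in the usage note are hypothetical.

```python
import json


def unload_payloads(running_models: list[str], keep: str) -> list[str]:
    """Build one JSON request body per model that should be unloaded.

    Each body is meant to be POSTed to Ollama's /api/generate endpoint;
    keep_alive 0 asks the server to unload that model immediately.
    """
    return [
        json.dumps({"model": name, "keep_alive": 0})
        for name in running_models
        if name != keep                    # never unload the translation model
    ]
```

For example, with the models "llama3:latest" and "qwen3:8b" loaded and "qwen3:8b" configured for translation, only one body for "llama3:latest" is produced.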

Live Output via SSE

The translation process runs as a subprocess started from the Admin UI. Its output is streamed to the browser line by line via Server-Sent Events (SSE). A keepalive ping keeps the connection open while the model is working. As a result, the browser shows progress and translated sections as they arrive, without polling. This makes a noticeable difference, especially during longer runs.
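The streaming side can be sketched with a generator that turns queued output lines into SSE messages and emits a comment line as keepalive when nothing arrives in time. The queue-based design, the 15 second timeout, and the names are assumptions for illustration; in Flask, such a generator would typically be wrapped in a Response with the mimetype text/event-stream.

```python
import queue
from typing import Iterator

KEEPALIVE = ": keepalive\n\n"  # SSE comment line; the browser's EventSource ignores it


def sse_event(line: str) -> str:
    """Format one line of subprocess output as an SSE message."""
    return f"data: {line}\n\n"


def stream(lines: queue.Queue, timeout: float = 15.0) -> Iterator[str]:
    """Yield SSE messages from a queue; None marks the end of the output."""
    while True:
        try:
            line = lines.get(timeout=timeout)
        except queue.Empty:
            yield KEEPALIVE                # nothing new: keep the connection open
            continue
        if line is None:
            return
        yield sse_event(line)
```

A reader thread would put each stripped line of the subprocess output into the queue and finish with None once the process exits.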

The Admin UI: SSR instead of SPA

The Decision

The admin interface is server-side rendered based on Flask, supplemented by Vanilla JavaScript for the interactive parts. This means no React app, no Vue, and no additional frontend build step.

This was a conscious decision. An SSR application is easier to debug, faster to develop, and less exposed to the churn of modern frontend frameworks, which regularly bring new dependencies and breaking changes. An admin interface that I use myself does not require a full-fledged SPA architecture.

The trade-off is clear: the interface does not react quite as instantly as an SPA. In practice, however, the combination of fetch and targeted DOM updates is perfectly sufficient for the important areas, such as the file explorer, the editor content, or the build output stream.

Security Headers

The Admin UI is not intended for public internet access, but for use within a local network. Nevertheless, conservative HTTP security headers are set:

X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Content-Security-Policy – only own resources, no external CDNs. Fonts and icons are provided locally.
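The headers above can be kept in one place and applied to every response. In this sketch, the concrete Content-Security-Policy value is an assumption (the text only says "own resources"), and apply_security_headers is a hypothetical helper; in Flask it would typically be called on response.headers from an after_request hook.

```python
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    # Assumed policy value: restrict all resource types to the own origin.
    "Content-Security-Policy": "default-src 'self'",
}


def apply_security_headers(headers):
    """Set the security headers on any dict-like header container."""
    for name, value in SECURITY_HEADERS.items():
        headers[name] = value
    return headers
```

Because the policy allows only the own origin, fonts and icons have to be served from the local static folder, which matches the setup described above.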

Path Traversal Protection

All file operations (reading, writing, and deleting) run through safe_path(). This function resolves the path and checks via relative_to() whether it lies within the project folder. Attempts to escape the project directory via ../../ are intercepted.
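A function along these lines could look as follows. This is a sketch based on the description, not the actual safe_path(): the signature and the choice of PermissionError are assumptions.

```python
from pathlib import Path


def safe_path(project_root: Path, user_path: str) -> Path:
    """Resolve user_path inside project_root; reject anything outside of it."""
    root = project_root.resolve()
    candidate = (root / user_path).resolve()   # collapses ".." and follows symlinks
    try:
        candidate.relative_to(root)            # raises ValueError if outside root
    except ValueError:
        raise PermissionError(f"path escapes project folder: {user_path}") from None
    return candidate
```

Resolving before the check is the important step: comparing unresolved strings would let a path like sites/../../secret pass a simple prefix test.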

Deploy: MD5 Sync, FTP, and SFTP

The deployment process works incrementally. Before uploading, an MD5 hash is calculated for each local file. On the remote server, the existing files are also hashed. Only files whose hashes differ are uploaded. This significantly reduces upload time and data volume, especially for sites with few changes.
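The comparison step can be sketched with two small functions. How the remote hashes are obtained is left out, since it depends on the transfer protocol; the function names and the dict-based representation are assumptions.

```python
import hashlib
from pathlib import Path


def local_hashes(root: Path) -> dict[str, str]:
    """Map each file below root (as a relative POSIX path) to its MD5 hex digest."""
    hashes: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            hashes[relative] = hashlib.md5(path.read_bytes()).hexdigest()
    return hashes


def files_to_upload(local: dict[str, str], remote: dict[str, str]) -> list[str]:
    """Return files that are new or whose hash differs from the remote copy."""
    return sorted(name for name, digest in local.items() if remote.get(name) != digest)
```

MD5 is used here only to detect changes, not for security, so its known collision weaknesses are not a concern in this role.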

SFTP runs via paramiko. The password is stored in the OS keyring (keyring) – in the Keychain on macOS, and in the Secret Service on Linux. This ensures it is not stored in plain text in a configuration file.

FTP is additionally available via the standard library (ftplib), meaning no further dependencies are required. This is particularly relevant for hosts that still do not offer SFTP.

A Connection Test button in the settings checks reachability and credentials before the actual deployment. This saves unnecessary troubleshooting if the credentials are simply incorrect.

Conclusion

Algroveon-Mini-SSG is kept intentionally simple in many places, but is more technical in others than it might appear from the outside. The translation engine, with its hash-based obsolescence detection, automatic provider detection, and VRAM management, is more than just a small helper script. Likewise, the template system with ChoiceLoader and site-specific overrides is not a coincidence, but was deliberately built this way. In my view, the decision for SSR in the Admin UI was also the right one.