# Recipe Importer
Docker container for importing recipes from Hungarian websites into [Mealie](https://mealie.io/) and [Tandoor Recipes](https://tandoor.dev/).

**Problem**: Mealie's and Tandoor's built-in URL import cannot parse ingredients and instructions from Hungarian recipe sites like mindmegette.hu.

**Solution**: This container provides a web UI that scrapes Hungarian recipe pages with site-specific parsers, lets you review and edit the extracted data, then pushes it to Mealie and/or Tandoor via their REST APIs.
## Architecture
```
┌──────────────────────────────────────────────────────┐
│  recipe-importer container (:8000)                   │
│                                                      │
│  Flask + Gunicorn                                    │
│  ├── /settings     → Configure Mealie & Tandoor      │
│  ├── /import       → Paste URL, scrape, review       │
│  ├── /scrape       → AJAX: parse recipe HTML         │
│  ├── /send         → AJAX: push to Mealie API        │
│  ├── /send-tandoor → AJAX: push to Tandoor API       │
│  └── /health       → Health check                    │
│                                                      │
│  Modules:                                            │
│  ├── app/config.py  → JSON config persistence        │
│  ├── app/scraper.py → Site-specific parsers          │
│  ├── app/mealie.py  → Mealie REST API client         │
│  └── app/tandoor.py → Tandoor REST API client        │
└───────────────────┬──────────────┬───────────────────┘
                    │ HTTP         │ HTTP
                    ▼              ▼
         ┌──────────────┐  ┌───────────────┐
         │    Mealie    │  │    Tandoor    │
         │ POST /api/.. │  │ POST /api/..  │
         │ PUT  /api/.. │  │ PUT  /api/..  │
         └──────────────┘  └───────────────┘
```
## Supported Sites
| Site | Ingredients | Instructions | Image |
|------|:-----------:|:------------:|:-----:|
| mindmegette.hu | Yes | Yes | Yes |
| *Other sites* | Fallback (schema.org JSON-LD) | Fallback (schema.org JSON-LD) | Yes (og:image) |
### Mindmegette.hu Parser
Extracts data from the Angular-rendered HTML:
- **Title**: `og:title` meta tag, with ` | Mindmegette.hu` suffix stripped
- **Description**: `og:description` meta tag
- **Image**: `og:image` meta tag
- **Ingredients**: `div.ingredients` containers holding `div.ingredients-meta` rows, each containing `<strong>` (qty), `<span>` (unit), `<a class="ingredients-link">` (food), `<small>` (extra)
- **Ingredient groups**: Multiple `div.ingredients` containers; group title via `<strong class="ingredients-group">`
- **Instructions**: `ol > li` elements inside `mindmegette-wysiwyg-box`
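
The three meta-tag fields above (title, description, image) can be sketched with the standard library alone. This is an illustration, not the project's parser: the real code uses BeautifulSoup, and the names `OpenGraphCollector`, `extract_basic_fields`, and the returned keys are assumptions made for this example.

```python
from html.parser import HTMLParser

# Suffix that mindmegette.hu appends to og:title, per the list above.
TITLE_SUFFIX = " | Mindmegette.hu"


class OpenGraphCollector(HTMLParser):
    """Collect <meta property="og:*" content="..."> values from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.og: dict = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        if prop.startswith("og:") and attr.get("content") is not None:
            # Keep the first occurrence of each property.
            self.og.setdefault(prop, attr["content"])


def extract_basic_fields(html: str) -> dict:
    """Return title, description and image taken from OpenGraph meta tags."""
    collector = OpenGraphCollector()
    collector.feed(html)
    title = collector.og.get("og:title", "").removesuffix(TITLE_SUFFIX).strip()
    return {
        "title": title,
        "description": collector.og.get("og:description", ""),
        "image": collector.og.get("og:image", ""),
    }
```

Ingredient and instruction extraction need real DOM traversal, which is where BeautifulSoup earns its place in the actual parser.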
### Generic Fallback Parser
For unsupported sites, attempts extraction via:
1. Schema.org JSON-LD `@type: Recipe` blocks (`recipeIngredient`, `recipeInstructions`)
2. OpenGraph meta tags for title, description, image
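
A minimal sketch of step 1, assuming the JSON-LD block may be a single object, a list, or an `@graph` wrapper, and that `recipeInstructions` may mix plain strings with `HowToStep` objects. The function names and the returned keys are hypothetical; the project's fallback parser may be structured differently.

```python
import json
import re
from typing import Optional

# Matches the body of every <script type="application/ld+json"> element.
LD_JSON_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)


def _iter_nodes(data):
    """Yield every dict in a JSON-LD document, including @graph members."""
    if isinstance(data, list):
        for item in data:
            yield from _iter_nodes(item)
    elif isinstance(data, dict):
        yield data
        yield from _iter_nodes(data.get("@graph", []))


def _is_recipe(node: dict) -> bool:
    node_type = node.get("@type")
    types = node_type if isinstance(node_type, list) else [node_type]
    return "Recipe" in types


def _instruction_texts(raw) -> list:
    """Flatten strings, HowToStep and HowToSection values into plain steps."""
    if isinstance(raw, str):
        return [raw.strip()] if raw.strip() else []
    if isinstance(raw, dict):
        if "itemListElement" in raw:
            return _instruction_texts(raw["itemListElement"])
        text = str(raw.get("text", "")).strip()
        return [text] if text else []
    if isinstance(raw, list):
        return [step for item in raw for step in _instruction_texts(item)]
    return []


def extract_json_ld_recipe(html: str) -> Optional[dict]:
    """Return the first schema.org Recipe found in the page, or None."""
    for match in LD_JSON_RE.finditer(html):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # Skip malformed blocks instead of failing the import.
        for node in _iter_nodes(data):
            if _is_recipe(node):
                ingredients = node.get("recipeIngredient", [])
                if isinstance(ingredients, str):
                    ingredients = [ingredients]
                return {
                    "title": node.get("name", ""),
                    "ingredients": [str(i).strip() for i in ingredients],
                    "instructions": _instruction_texts(node.get("recipeInstructions")),
                }
    return None
```

Returning `None` when no `Recipe` node exists lets the caller fall through to the OpenGraph tags in step 2.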
### Adding a New Site Parser
1. Create a parser function in `app/scraper.py` with the `@_register("hostname")` decorator
2. The function receives `(soup: BeautifulSoup, url: str)` and returns the standard recipe dict
3. The hostname substring is matched against the URL — first match wins, unmatched URLs use the generic fallback
## Mealie API Integration
The importer uses the Mealie REST API:
1. **POST** `/api/recipes` — create a stub recipe (returns slug)
2. **PATCH** `/api/recipes/{slug}` — populate structured ingredients (with unit/food IDs), instructions, description, orgURL
3. **PUT** `/api/recipes/{slug}/image` — upload the recipe image

**Structured ingredients**: The client resolves unit and food names to Mealie database IDs. Missing units/foods are created automatically via the API. Ingredient groups are supported via the `title` field on the first ingredient of each group.

Authentication uses a long-lived API token (Bearer header), created in Mealie at *Profile → API Tokens*.
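
The name-to-ID resolution and group titles can be illustrated without any network calls. In this sketch the `create` callback stands in for whatever API request the real client makes, and the case-insensitive matching, the input shape (`groups` with `title` and `items`), and the exact ingredient keys are assumptions rather than a description of `app/mealie.py`.

```python
from typing import Callable, Optional


class NameResolver:
    """Map names to IDs, creating missing entries through a callback."""

    def __init__(self, existing: dict, create: Callable[[str], str]) -> None:
        # Assumption: names are compared case-insensitively.
        self._ids = {name.strip().lower(): id_ for name, id_ in existing.items()}
        self._create = create

    def resolve(self, name: str) -> Optional[str]:
        key = name.strip().lower()
        if not key:
            return None  # Nothing to resolve, e.g. an ingredient without a unit.
        if key not in self._ids:
            # In the real client this would be an API call that creates the entry.
            self._ids[key] = self._create(name.strip())
        return self._ids[key]


def build_ingredients(groups: list, units: NameResolver, foods: NameResolver) -> list:
    """Flatten ingredient groups into one list of structured ingredients."""
    result = []
    for group in groups:
        for index, item in enumerate(group["items"]):
            unit_id = units.resolve(item.get("unit", ""))
            food_id = foods.resolve(item.get("food", ""))
            result.append({
                # Only the first ingredient of a group carries the group title.
                "title": group.get("title", "") if index == 0 else "",
                "quantity": item.get("quantity"),
                "unit": {"id": unit_id} if unit_id else None,
                "food": {"id": food_id} if food_id else None,
                "note": item.get("note", ""),
            })
    return result
```

Caching resolved names inside the resolver means an ingredient that appears several times in one recipe triggers at most one create call.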
## Tandoor API Integration
The importer uses the Tandoor REST API:
1. **POST** `/api/recipe/` — create the full recipe in one call (name, description, source_url, steps with nested ingredients)
2. **PUT** `/api/recipe/{id}/image/` — upload the recipe image

**Step-based ingredients**: Tandoor nests ingredients inside steps. All ingredients are attached to the first step. Units and foods are auto-created by name (no separate resolution needed). Ingredient groups use `is_header: true` on a header entry.

**Duplicate detection**: Before import, searches Tandoor by title and checks the `source_url` field to detect already-imported recipes.

Authentication uses an API token (Bearer header), created in Tandoor at *Settings → API Browser → Auth Token*.
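
A rough sketch of the payload shape and the duplicate check, with no HTTP involved. The input keys (`ingredient_groups`, `items`, `instructions`), the use of `note` for the header text, and the trailing-slash normalisation are assumptions for illustration; check the Tandoor API schema of your version before relying on exact field names.

```python
from typing import Optional


def build_tandoor_payload(recipe: dict) -> dict:
    """Build a single create-recipe body with all ingredients on the first step."""
    ingredients = []
    for group in recipe["ingredient_groups"]:
        if group.get("title"):
            # Header entry introducing the group (text placement is an assumption).
            ingredients.append({
                "is_header": True, "note": group["title"],
                "food": None, "unit": None, "amount": 0,
            })
        for item in group["items"]:
            ingredients.append({
                "is_header": False,
                "amount": item.get("quantity") or 0,
                # Sent by name only; Tandoor is expected to create missing entries.
                "unit": {"name": item["unit"]} if item.get("unit") else None,
                "food": {"name": item["food"]} if item.get("food") else None,
                "note": item.get("note", ""),
            })
    # Keep at least one step so the ingredients always have somewhere to live.
    texts = recipe.get("instructions") or [""]
    steps = [
        {"instruction": text, "ingredients": ingredients if index == 0 else []}
        for index, text in enumerate(texts)
    ]
    return {
        "name": recipe["title"],
        "description": recipe.get("description", ""),
        "source_url": recipe.get("source_url", ""),
        "steps": steps,
    }


def find_duplicate(search_results: list, source_url: str) -> Optional[dict]:
    """Return the first search hit whose source_url matches, ignoring a trailing slash."""
    wanted = source_url.rstrip("/")
    for candidate in search_results:
        if (candidate.get("source_url") or "").rstrip("/") == wanted:
            return candidate
    return None
```

Here `search_results` stands for the recipes returned by a title search; comparing `source_url` rather than the title alone avoids flagging two different recipes that happen to share a name.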
## Configuration
All settings are persisted to `/data/config.json` (mounted as a Docker volume).
| Setting | Description |
|---------|-------------|
| `mealie_url` | Full URL to Mealie instance (e.g. `https://mealie.example.com`) |
| `mealie_api_key` | Mealie API token |
| `tandoor_url` | Full URL to Tandoor instance (e.g. `https://recipes.example.com`) |
| `tandoor_api_key` | Tandoor API token |
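
As an illustration of how these four settings might be persisted, the sketch below reads and writes a JSON file under `DATA_DIR`. The function names, the empty-string defaults, and the write-then-rename step are assumptions; `app/config.py` may handle this differently.

```python
import json
import os
from pathlib import Path

# The four settings from the table above, with assumed empty defaults.
DEFAULTS = {
    "mealie_url": "",
    "mealie_api_key": "",
    "tandoor_url": "",
    "tandoor_api_key": "",
}


def config_path() -> Path:
    """Location of the config file, honouring the DATA_DIR environment variable."""
    return Path(os.environ.get("DATA_DIR", "/data")) / "config.json"


def load_config(path: Path) -> dict:
    """Load settings, falling back to defaults for missing or unreadable files."""
    try:
        stored = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        stored = {}
    if not isinstance(stored, dict):
        stored = {}
    # Ignore unknown keys so stale entries never leak into the app.
    return {**DEFAULTS, **{k: v for k, v in stored.items() if k in DEFAULTS}}


def save_config(path: Path, values: dict) -> None:
    """Merge new values into the stored settings and write them back."""
    known = {k: v for k, v in values.items() if k in DEFAULTS}
    merged = {**load_config(path), **known}
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(merged, indent=2), encoding="utf-8")
    os.replace(tmp, path)  # Rename over the old file so readers never see a partial write.
```

Typical use would be `save_config(config_path(), {"mealie_url": "https://mealie.example.com"})` from the settings handler and `load_config(config_path())` wherever a client is built.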
## Deployment
### Docker Compose
```yaml
services:
  recipe-importer:
    image: gitea.dooplex.hu/admin/recipe-importer:0.1.9
    container_name: recipe-importer
    restart: unless-stopped
    ports:
      - "8011:8000"
    volumes:
      - recipe-data:/data
    environment:
      - SECRET_KEY=change-me-in-production
      - MEALIE_INTERNAL_URL=http://mealie:9000
      - TANDOOR_INTERNAL_URL=http://tandoor:8080

volumes:
  recipe-data:
```
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `SECRET_KEY` | `recipe-importer-dev-key` | Flask session secret |
| `DATA_DIR` | `/data` | Persistent storage path |
| `VERSION` | `dev` | Shown in the UI navbar |
| `MEALIE_INTERNAL_URL` | *(empty)* | Docker-internal Mealie URL (e.g. `http://mealie:9000`), used to avoid hairpinning requests through Cloudflare |
| `TANDOOR_INTERNAL_URL` | *(empty)* | Docker-internal Tandoor URL (e.g. `http://tandoor:8080`), used to avoid hairpinning requests through Cloudflare |
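
One plausible reading of the two internal-URL variables is that, when set, they replace the configured public URL for API calls. The sketch below encodes that assumption; the function names are hypothetical and the real precedence rules live in the application code.

```python
import os
from typing import Mapping


def api_base_url(configured_url: str, internal_url: str = "") -> str:
    """Prefer the Docker-internal URL when present, else the configured one."""
    chosen = internal_url.strip() or configured_url.strip()
    # Drop a trailing slash so paths can be appended uniformly.
    return chosen.rstrip("/")


def mealie_base_url(config: Mapping, env: Mapping = os.environ) -> str:
    """Base URL for Mealie API calls (assumed precedence: env var, then config)."""
    return api_base_url(config.get("mealie_url", ""), env.get("MEALIE_INTERNAL_URL", ""))


def tandoor_base_url(config: Mapping, env: Mapping = os.environ) -> str:
    """Base URL for Tandoor API calls (assumed precedence: env var, then config)."""
    return api_base_url(config.get("tandoor_url", ""), env.get("TANDOOR_INTERNAL_URL", ""))
```

Under this reading the public URL from the settings page stays useful for links shown to the user, while the container talks to its neighbours directly over the Docker network.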
## Building
On the build server (kisfenyo@192.168.0.180), replacing `X.X.X` with the version to build:
```bash
cd ~/build/recipe-importer
./build.sh X.X.X --push
```
## Web UI
The UI is in Hungarian and uses a dark theme. The workflow is:
1. **Settings** (`/settings`) — Configure Mealie and/or Tandoor connection (URL + API key), test each connection
2. **Import** (`/import`) — Paste a recipe URL, click "Beolvasás" (Scrape)
3. **Review** — Edit structured ingredients (4-column: quantity, unit, food, note), add/remove ingredient groups, edit instructions
4. **Send** — Click "Importálás Mealie-be" and/or "Importálás Tandoor-ba" to push to your configured services
## Tech Stack
- **Runtime**: Python 3.12 (slim)
- **Web framework**: Flask 3.1 + Gunicorn
- **HTML parsing**: BeautifulSoup 4 + lxml
- **HTTP client**: requests
- **Container**: ~60 MB image