# Recipe Importer

Docker container for importing recipes from Hungarian websites into [Mealie](https://mealie.io/) and [Tandoor Recipes](https://tandoor.dev/).

**Problem**: Mealie's and Tandoor's built-in URL import cannot parse ingredients and instructions from Hungarian recipe sites like mindmegette.hu.

**Solution**: This container provides a web UI that scrapes Hungarian recipe pages with site-specific parsers, lets you review and edit the extracted data, then pushes it to Mealie and/or Tandoor via their REST APIs.

## Architecture

```
┌──────────────────────────────────────────────────────┐
│  recipe-importer container (:8000)                   │
│                                                      │
│  Flask + Gunicorn                                    │
│  ├── /settings      → Configure Mealie & Tandoor     │
│  ├── /import        → Paste URL, scrape, review      │
│  ├── /scrape        → AJAX: parse recipe HTML        │
│  ├── /send          → AJAX: push to Mealie API       │
│  ├── /send-tandoor  → AJAX: push to Tandoor API      │
│  └── /health        → Health check                   │
│                                                      │
│  Modules:                                            │
│  ├── app/config.py   → JSON config persistence       │
│  ├── app/scraper.py  → Site-specific parsers         │
│  ├── app/mealie.py   → Mealie REST API client        │
│  └── app/tandoor.py  → Tandoor REST API client       │
└───────────────────┬──────────────┬───────────────────┘
                    │ HTTP         │ HTTP
                    ▼              ▼
            ┌──────────────┐  ┌───────────────┐
            │    Mealie    │  │    Tandoor    │
            │ POST /api/.. │  │ POST /api/..  │
            │ PUT /api/..  │  │ PUT /api/..   │
            └──────────────┘  └───────────────┘
```

## Supported Sites

| Site | Ingredients | Instructions | Image |
|------|:-----------:|:------------:|:-----:|
| mindmegette.hu | Yes | Yes | Yes |
| *Other sites* | Fallback (schema.org JSON-LD) | Fallback (schema.org JSON-LD) | Yes (og:image) |

### Mindmegette.hu Parser

Extracts data from the Angular-rendered HTML:

- **Title**: `og:title` meta tag, with ` | Mindmegette.hu` suffix stripped
- **Description**: `og:description` meta tag
- **Image**: `og:image` meta tag
- **Ingredients**: `div.ingredients` → `div.ingredients-meta` rows, each containing `<strong>` (qty), `<span>` (unit), `<a class="ingredients-link">` (food), `<small>` (extra)
- **Ingredient groups**: Multiple `div.ingredients` containers; group title via `<strong class="ingredients-group">`
- **Instructions**: `mindmegette-wysiwyg-box` → `ol > li` elements

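
The ingredient selectors listed above can be illustrated with a short BeautifulSoup sketch. This is not the code in `app/scraper.py`: the sample HTML, the exact nesting of the group title, the function name `parse_ingredient_groups`, and the output dict keys are all assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hand-written sample that mimics the structure described above (an assumption,
# not a copy of a real mindmegette.hu page).
SAMPLE_HTML = """
<div class="ingredients">
  <strong class="ingredients-group">Tészta</strong>
  <div class="ingredients-meta">
    <strong>50</strong> <span>dkg</span>
    <a class="ingredients-link" href="#">liszt</a> <small>átszitálva</small>
  </div>
  <div class="ingredients-meta">
    <strong>2</strong> <span>db</span>
    <a class="ingredients-link" href="#">tojás</a>
  </div>
</div>
"""


def _text(tag):
    """Return the stripped text of a tag, or an empty string if it is missing."""
    return tag.get_text(strip=True) if tag else ""


def parse_ingredient_groups(html):
    """Hypothetical helper: one dict per div.ingredients container."""
    soup = BeautifulSoup(html, "html.parser")
    groups = []
    for container in soup.select("div.ingredients"):
        title = _text(container.find("strong", class_="ingredients-group"))
        items = []
        for row in container.select("div.ingredients-meta"):
            items.append({
                "qty": _text(row.find("strong")),
                "unit": _text(row.find("span")),
                "food": _text(row.find("a", class_="ingredients-link")),
                "note": _text(row.find("small")),
            })
        groups.append({"title": title, "items": items})
    return groups
```

Missing elements (for example a row without `<small>`) come back as empty strings, so the review form can always show four columns.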
### Generic Fallback Parser

For unsupported sites, attempts extraction via:

1. Schema.org JSON-LD `@type: Recipe` blocks (`recipeIngredient`, `recipeInstructions`)
2. OpenGraph meta tags for title, description, image

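
As a rough illustration of the JSON-LD step, the sketch below pulls `recipeIngredient` and `recipeInstructions` out of a page. It deliberately uses the standard-library `html.parser` instead of BeautifulSoup so that it runs without dependencies; the class and function names are hypothetical, and the handling of `@graph` and `HowToStep` objects is an assumption about what real pages contain.

```python
import json
from html.parser import HTMLParser


class _JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.blocks[-1] += data


def _iter_nodes(data):
    """Yield every dict in a JSON-LD document, including @graph members."""
    if isinstance(data, list):
        for item in data:
            yield from _iter_nodes(item)
    elif isinstance(data, dict):
        yield data
        yield from _iter_nodes(data.get("@graph", []))


def _is_recipe(node):
    node_type = node.get("@type")
    return node_type == "Recipe" or (isinstance(node_type, list) and "Recipe" in node_type)


def extract_jsonld_recipe(html):
    """Return ingredients and instructions from the first Recipe node, or None."""
    collector = _JsonLdCollector()
    collector.feed(html)
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole scrape
        for node in _iter_nodes(data):
            if _is_recipe(node):
                steps = node.get("recipeInstructions", [])
                if isinstance(steps, str):
                    steps = [steps]
                return {
                    "ingredients": list(node.get("recipeIngredient", [])),
                    "instructions": [
                        s.get("text", "") if isinstance(s, dict) else s for s in steps
                    ],
                }
    return None
```

Returning `None` when no `Recipe` node is found leaves room for the OpenGraph tags to supply at least title, description, and image.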
### Adding a New Site Parser

1. Create a parser function in `app/scraper.py` with the `@_register("hostname")` decorator
2. The function receives `(soup: BeautifulSoup, url: str)` and returns the standard recipe dict
3. The hostname substring is matched against the URL: first match wins, unmatched URLs use the generic fallback

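
The registration steps above could be wired together roughly as follows. Only the `_register` name and the `(soup, url)` signature come from this README; the registry list, `pick_parser`, `generic_fallback`, the example hostname, and the shape of the returned dict are assumptions for illustration.

```python
# Hypothetical registry: (hostname substring, parser function) in registration order.
_PARSERS = []


def _register(hostname):
    """Decorator that records a parser for URLs containing `hostname`."""
    def decorator(func):
        _PARSERS.append((hostname, func))
        return func
    return decorator


def generic_fallback(soup, url):
    # Placeholder for the JSON-LD / OpenGraph fallback described earlier.
    return {"title": "", "ingredients": [], "instructions": [], "source": "generic"}


@_register("example-recipes.hu")  # made-up site, used only for this sketch
def parse_example_recipes(soup, url):
    # A real parser would read fields from `soup`; the dict shape is assumed.
    return {"title": "", "ingredients": [], "instructions": [], "source": "example"}


def pick_parser(url):
    """Return the first registered parser whose hostname occurs in the URL."""
    for hostname, func in _PARSERS:
        if hostname in url:
            return func
    return generic_fallback
```

Because matching is a plain substring test and the first match wins, more specific hostnames should be registered before broader ones.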
## Mealie API Integration

The importer uses the Mealie REST API:

1. **POST** `/api/recipes`: create a stub recipe (returns slug)
2. **PATCH** `/api/recipes/{slug}`: populate structured ingredients (with unit/food IDs), instructions, description, orgURL
3. **PUT** `/api/recipes/{slug}/image`: upload the recipe image

**Structured ingredients**: The client resolves unit and food names to Mealie database IDs. Missing units/foods are created automatically via the API. Ingredient groups are supported via the `title` field on the first ingredient of each group.

Authentication uses a long-lived API token (Bearer header), created in Mealie at *Profile → API Tokens*.

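
To make the structured-ingredient rule concrete, here is a small sketch that builds the ingredient list for the PATCH call. It assumes the unit and food IDs have already been resolved into lookup dicts; the function name, the input shape, and the exact payload keys (`quantity`, `unit`, `food`, `note`, `title`) are assumptions and should be checked against the Mealie version in use.

```python
def build_mealie_ingredients(groups, unit_ids, food_ids):
    """Flatten ingredient groups into a list of Mealie-style ingredient dicts.

    groups:   [{"title": str, "items": [{"qty", "unit", "food", "note"}, ...]}, ...]
    unit_ids: unit name -> Mealie unit ID (already resolved or created)
    food_ids: food name -> Mealie food ID (already resolved or created)
    """
    ingredients = []
    for group in groups:
        for position, item in enumerate(group["items"]):
            unit = item.get("unit")
            ingredients.append({
                # Only the first ingredient of a group carries the group title.
                "title": group.get("title", "") if position == 0 else "",
                "quantity": item.get("qty"),
                "unit": {"id": unit_ids[unit], "name": unit} if unit else None,
                "food": {"id": food_ids[item["food"]], "name": item["food"]},
                "note": item.get("note", ""),
            })
    return ingredients
```

A group with an empty title simply produces ingredients without a heading, so ungrouped recipes go through the same code path.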
## Tandoor API Integration

The importer uses the Tandoor REST API:

1. **POST** `/api/recipe/`: create the full recipe in one call (name, description, source_url, steps with nested ingredients)
2. **PUT** `/api/recipe/{id}/image/`: upload the recipe image

**Step-based ingredients**: Tandoor nests ingredients inside steps. All ingredients are attached to the first step. Units and foods are auto-created by name (no separate resolution needed). Ingredient groups use `is_header: true` on a header entry.

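
The step-based layout can be sketched as a payload builder for the single POST call. The function name and the input dict shape are hypothetical, and the ingredient keys other than `is_header` (`food`, `unit`, `amount`, `note`, `instruction`) are assumptions about the Tandoor schema rather than something this README specifies.

```python
def build_tandoor_recipe(recipe):
    """Build a recipe body with all ingredients attached to the first step."""
    ingredients = []
    for group in recipe["ingredient_groups"]:
        if group.get("title"):
            # A group becomes a header entry followed by its ingredients.
            ingredients.append({
                "is_header": True,
                "note": group["title"],
                "food": None,
                "unit": None,
                "amount": 0,
            })
        for item in group["items"]:
            unit = item.get("unit")
            ingredients.append({
                "is_header": False,
                "food": {"name": item["food"]},  # referenced by name only
                "unit": {"name": unit} if unit else None,
                "amount": item.get("qty", 0),
                "note": item.get("note", ""),
            })

    # Keep at least one step so the ingredients always have somewhere to live.
    instructions = recipe.get("instructions") or [""]
    steps = [
        {"instruction": text, "ingredients": ingredients if index == 0 else []}
        for index, text in enumerate(instructions)
    ]
    return {
        "name": recipe["title"],
        "description": recipe.get("description", ""),
        "source_url": recipe.get("url", ""),
        "steps": steps,
    }
```

Sending units and foods as `{"name": ...}` objects mirrors the "auto-created by name" behaviour described above, which is why no ID lookup appears here.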
**Duplicate detection**: Before import, searches Tandoor by title and checks the `source_url` field to detect already-imported recipes.

Authentication uses an API token (Bearer header), created in Tandoor at *Settings → API Browser → Auth Token*.

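
The duplicate check can be pictured as a filter over the title search results. The sketch below assumes the candidates have already been fetched from Tandoor as dicts that include `source_url`; the function names and the URL normalisation (ignoring case and a trailing slash) are illustrative additions, not documented behaviour of the importer.

```python
def _normalize_url(url):
    """Compare URLs loosely: ignore surrounding whitespace, case, trailing slash."""
    return (url or "").strip().lower().rstrip("/")


def find_duplicate(candidates, source_url):
    """Return the first candidate whose source_url matches, or None."""
    wanted = _normalize_url(source_url)
    if not wanted:
        return None  # nothing to compare against
    for candidate in candidates:
        if _normalize_url(candidate.get("source_url")) == wanted:
            return candidate
    return None
```

Matching on `source_url` rather than on the title alone avoids flagging two different recipes that happen to share a name.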
## Configuration

All settings are persisted to `/data/config.json` (mounted as a Docker volume).

| Setting | Description |
|---------|-------------|
| `mealie_url` | Full URL to Mealie instance (e.g. `https://mealie.example.com`) |
| `mealie_api_key` | Mealie API token |
| `tandoor_url` | Full URL to Tandoor instance (e.g. `https://recipes.example.com`) |
| `tandoor_api_key` | Tandoor API token |

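
A minimal sketch of JSON persistence for these four settings follows. It is not the content of `app/config.py`: the function names, the empty-string defaults, and the write-to-temp-file-then-rename step are assumptions chosen to keep the example safe against half-written files.

```python
import json
import os
from pathlib import Path

# The four keys come from the table above; empty defaults are an assumption.
DEFAULT_CONFIG = {
    "mealie_url": "",
    "mealie_api_key": "",
    "tandoor_url": "",
    "tandoor_api_key": "",
}


def load_config(data_dir="/data"):
    """Read config.json from data_dir, falling back to defaults for missing keys."""
    path = Path(data_dir) / "config.json"
    config = dict(DEFAULT_CONFIG)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config


def save_config(config, data_dir="/data"):
    """Write config.json via a temporary file so readers never see a partial file."""
    path = Path(data_dir) / "config.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp_path = path.with_name("config.json.tmp")
    tmp_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    os.replace(tmp_path, path)
```

Merging the stored values over the defaults means a config file written by an older version still loads after new settings are introduced.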
## Deployment

### Docker Compose

```yaml
services:
  recipe-importer:
    image: gitea.dooplex.hu/admin/recipe-importer:0.1.9
    container_name: recipe-importer
    restart: unless-stopped
    ports:
      - "8011:8000"
    volumes:
      - recipe-data:/data
    environment:
      - SECRET_KEY=change-me-in-production
      - MEALIE_INTERNAL_URL=http://mealie:9000
      - TANDOOR_INTERNAL_URL=http://tandoor:8080

volumes:
  recipe-data:
```

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `SECRET_KEY` | `recipe-importer-dev-key` | Flask session secret; override the default in production |
| `DATA_DIR` | `/data` | Persistent storage path |
| `VERSION` | `dev` | Shown in the UI navbar |
| `MEALIE_INTERNAL_URL` | *(empty)* | Docker-internal Mealie URL (e.g. `http://mealie:9000`), used to avoid hairpinning through Cloudflare |
| `TANDOOR_INTERNAL_URL` | *(empty)* | Docker-internal Tandoor URL (e.g. `http://tandoor:8080`), used to avoid hairpinning through Cloudflare |

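
One way to read these variables is sketched below. The variable names and defaults are taken from the table; the helper names and the rule that a non-empty internal URL takes precedence over the URL saved in the settings page are assumptions about how the importer uses them.

```python
import os

# Names and defaults as listed in the table above.
ENV_DEFAULTS = {
    "SECRET_KEY": "recipe-importer-dev-key",
    "DATA_DIR": "/data",
    "VERSION": "dev",
    "MEALIE_INTERNAL_URL": "",
    "TANDOOR_INTERNAL_URL": "",
}


def read_environment(environ=None):
    """Return every known variable, using the documented default when unset."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, default) for name, default in ENV_DEFAULTS.items()}


def effective_base_url(configured_url, internal_url):
    """Prefer the Docker-internal URL when set (assumed precedence)."""
    return (internal_url or configured_url).rstrip("/")
```

For example, `effective_base_url("https://mealie.example.com", "http://mealie:9000")` yields the internal address, while an empty internal URL falls back to the public one.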
## Building

On the build server (kisfenyo@192.168.0.180):

```bash
cd ~/build/recipe-importer
./build.sh X.X.X --push
```

## Web UI

The UI is in Hungarian and uses a dark theme. The workflow is:

1. **Settings** (`/settings`): Configure Mealie and/or Tandoor connection (URL + API key), test each connection
2. **Import** (`/import`): Paste a recipe URL, click "Beolvasás" (Scrape)
3. **Review**: Edit structured ingredients (4-column: quantity, unit, food, note), add/remove ingredient groups, edit instructions
4. **Send**: Click "Importálás Mealie-be" and/or "Importálás Tandoor-ba" to push to your configured services

## Tech Stack

- **Runtime**: Python 3.12 (slim)
- **Web framework**: Flask 3.1 + Gunicorn
- **HTML parsing**: BeautifulSoup 4 + lxml
- **HTTP client**: requests
- **Container**: ~60 MB image