# Recipe Importer

Docker container for importing recipes from Hungarian websites into [Mealie](https://mealie.io/) and [Tandoor Recipes](https://tandoor.dev/).

**Problem**: Mealie's and Tandoor's built-in URL import cannot parse ingredients and instructions from Hungarian recipe sites like mindmegette.hu.

**Solution**: This container provides a web UI that scrapes Hungarian recipe pages with site-specific parsers, lets you review and edit the extracted data, then pushes it to Mealie and/or Tandoor via their REST APIs.

## Architecture

```
┌──────────────────────────────────────────────────────┐
│  recipe-importer container (:8000)                   │
│                                                      │
│  Flask + Gunicorn                                    │
│  ├── /settings     → Configure Mealie & Tandoor      │
│  ├── /import       → Paste URL, scrape, review       │
│  ├── /scrape       → AJAX: parse recipe HTML         │
│  ├── /send         → AJAX: push to Mealie API        │
│  ├── /send-tandoor → AJAX: push to Tandoor API       │
│  ├── /tags         → AJAX: list tags from both       │
│  └── /health       → Health check                    │
│                                                      │
│  Modules:                                            │
│  ├── app/config.py  → JSON config persistence        │
│  ├── app/scraper.py → Site-specific parsers          │
│  ├── app/mealie.py  → Mealie REST API client         │
│  └── app/tandoor.py → Tandoor REST API client        │
└───────────────────┬──────────────┬───────────────────┘
                    │ HTTP         │ HTTP
                    ▼              ▼
          ┌──────────────┐  ┌───────────────┐
          │    Mealie    │  │    Tandoor    │
          │ POST /api/.. │  │ POST /api/..  │
          │ PUT  /api/.. │  │ PUT  /api/..  │
          └──────────────┘  └───────────────┘
```

## Supported Sites

| Site | Ingredients | Instructions | Image | Tags |
|------|:-----------:|:------------:|:-----:|:----:|
| mindmegette.hu | Yes | Yes | Yes | Yes |
| *Other sites* | Fallback (schema.org JSON-LD) | Fallback (schema.org JSON-LD) | Yes (og:image) | Fallback (schema.org keywords) |

### Mindmegette.hu Parser

Extracts data from the Angular-rendered HTML:

- **Title**: `og:title` meta tag, with ` | Mindmegette.hu` suffix stripped
- **Description**: `og:description` meta tag
- **Image**: `og:image` meta tag
- **Ingredients**: `div.ingredients` → `div.ingredients-meta` rows, each containing `` (qty), `` (unit), `` (food), `` (extra)
- **Ingredient groups**: Multiple `div.ingredients` containers; group title via ``
- **Instructions**: `mindmegette-wysiwyg-box` → `ol > li` elements
- **Tags**: `` elements inside `div.desktop-wrapper`

### Generic Fallback Parser

For unsupported sites, attempts extraction via:

1. Schema.org JSON-LD `@type: Recipe` blocks (`recipeIngredient`, `recipeInstructions`, `keywords`)
2. OpenGraph meta tags for title, description, image

### Adding a New Site Parser

1. Create a parser function in `app/scraper.py` with the `@_register("hostname")` decorator
2. The function receives `(soup: BeautifulSoup, url: str)` and returns the standard recipe dict
3. The hostname substring is matched against the URL; the first match wins, and unmatched URLs use the generic fallback

## Mealie API Integration

The importer uses the Mealie REST API:

1. **POST** `/api/recipes`: create a stub recipe (returns slug)
2. **PATCH** `/api/recipes/{slug}`: populate structured ingredients (with unit/food IDs), instructions, description, orgURL
3. **PUT** `/api/recipes/{slug}/image`: upload the recipe image

**Structured ingredients**: The client resolves unit and food names to Mealie database IDs. Missing units/foods are created automatically via the API.
Ingredient groups are supported via the `title` field on the first ingredient of each group.

Authentication uses a long-lived API token (Bearer header), created in Mealie at *Profile → API Tokens*.

## Tandoor API Integration

The importer uses the Tandoor REST API:

1. **POST** `/api/recipe/`: create the full recipe in one call (name, description, source_url, steps with nested ingredients)
2. **PUT** `/api/recipe/{id}/image/`: upload the recipe image

**Step-based ingredients**: Tandoor nests ingredients inside steps. All ingredients are attached to the first step. Units and foods are auto-created by name (no separate resolution needed). Ingredient groups use `is_header: true` on a header entry.

**Duplicate detection**: Before import, the client searches Tandoor by title and checks the `source_url` field to detect already-imported recipes.

Authentication uses an API token (Bearer header), created in Tandoor at *Settings → API Browser → Auth Token*.

## Tag Management

Tags are scraped from recipe pages and shown as editable chips in the UI. Users can:

- **Remove** scraped tags that are irrelevant
- **Search** existing tags from Mealie and Tandoor (fetched via the `GET /tags` endpoint)
- **Add** custom tags by typing and pressing Enter

Tags are sent to both services on import:

- **Mealie**: Tags are created via `POST /api/organizers/tags` if they don't exist, then attached to the recipe in the PATCH payload
- **Tandoor**: Keywords are auto-created by including `keywords: [{"name": "..."}]` in the recipe POST

## Configuration

All settings are persisted to `/data/config.json` (mounted as a Docker volume).

| Setting | Description |
|---------|-------------|
| `mealie_url` | Full URL to Mealie instance (e.g. `https://mealie.example.com`) |
| `mealie_api_key` | Mealie API token |
| `tandoor_url` | Full URL to Tandoor instance (e.g. `https://recipes.example.com`) |
| `tandoor_api_key` | Tandoor API token |

## Deployment

### Docker Compose

```yaml
services:
  recipe-importer:
    image: gitea.dooplex.hu/admin/recipe-importer:0.2.0
    container_name: recipe-importer
    restart: unless-stopped
    ports:
      - "8011:8000"
    volumes:
      - recipe-data:/data
    environment:
      - SECRET_KEY=change-me-in-production
      - MEALIE_INTERNAL_URL=http://mealie:9000
      - TANDOOR_INTERNAL_URL=http://tandoor:8080

volumes:
  recipe-data:
```

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `SECRET_KEY` | `recipe-importer-dev-key` | Flask session secret |
| `DATA_DIR` | `/data` | Persistent storage path |
| `VERSION` | `dev` | Shown in the UI navbar |
| `MEALIE_INTERNAL_URL` | *(empty)* | Docker-internal Mealie URL (e.g. `http://mealie:9000`) to avoid a Cloudflare hairpin |
| `TANDOOR_INTERNAL_URL` | *(empty)* | Docker-internal Tandoor URL (e.g. `http://tandoor:8080`) to avoid a Cloudflare hairpin |

## Building

On the build server (kisfenyo@192.168.0.180):

```bash
cd ~/build/recipe-importer
./build.sh X.X.X --push
```

## Web UI

The UI is in Hungarian and uses a dark theme. The workflow is:

1. **Settings** (`/settings`): Configure the Mealie and/or Tandoor connection (URL + API key) and test each connection
2. **Import** (`/import`): Paste a recipe URL, click "Beolvasás" (Scrape)
3. **Review**: Edit structured ingredients (4-column: quantity, unit, food, note), add/remove ingredient groups, edit instructions, manage tags (add/remove/search existing)
4. **Send**: Click "Importálás Mealie-be" and/or "Importálás Tandoor-ba" to push to your configured services

## Tech Stack

- **Runtime**: Python 3.12 (slim)
- **Web framework**: Flask 3.1 + Gunicorn
- **HTML parsing**: BeautifulSoup 4 + lxml
- **HTTP client**: requests
- **Container**: ~60 MB image