Recipe Importer

Docker container for importing recipes from Hungarian websites into Mealie (Tandoor support planned).

Problem: Mealie's built-in URL import cannot parse ingredients and instructions from Hungarian recipe sites like mindmegette.hu — it imports the title and image but shows "Could not detect ingredients / instructions".

Solution: This container provides a web UI that scrapes Hungarian recipe pages with site-specific parsers, lets you review and edit the extracted data, then pushes it to Mealie via its REST API.

Architecture

┌─────────────────────────────────────────────────┐
│  recipe-importer container (:8000)              │
│                                                 │
│  Flask + Gunicorn                               │
│  ├── /settings    → Configure Mealie connection │
│  ├── /import      → Paste URL, scrape, review   │
│  ├── /scrape      → AJAX: parse recipe HTML     │
│  ├── /send        → AJAX: push to Mealie API    │
│  └── /health      → Health check                │
│                                                 │
│  Modules:                                       │
│  ├── app/config.py   → JSON config persistence  │
│  ├── app/scraper.py  → Site-specific parsers    │
│  └── app/mealie.py   → Mealie REST API client   │
└───────────────────┬─────────────────────────────┘
                    │ HTTP
                    ▼
         ┌──────────────────┐
         │  Mealie instance  │
         │  POST /api/recipes│
         │  PATCH /api/...   │
         │  PUT /api/.../img │
         └──────────────────┘

Supported Sites

| Site | Ingredients | Instructions | Image |
|---|---|---|---|
| mindmegette.hu | Yes | Yes | Yes |
| Other sites | Fallback (schema.org JSON-LD) | Fallback (schema.org JSON-LD) | Yes (og:image) |

Mindmegette.hu Parser

Extracts data from the Angular-rendered HTML:

  • Title: og:title meta tag, with the trailing "| Mindmegette.hu" suffix stripped
  • Description: og:description meta tag
  • Image: og:image meta tag
  • Ingredients: div.ingredients-meta rows inside div.ingredients, each containing <strong> (quantity), <span> (unit), <a class="ingredients-link"> (food), <small> (extra note)
  • Ingredient groups: Multiple div.ingredients containers; group title via <strong class="ingredients-group">
  • Instructions: mindmegette-wysiwyg-box ol > li elements
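
The ingredient-row mapping above can be sketched as follows. This is a minimal illustration, not the code in app/scraper.py: it uses the standard library's html.parser instead of BeautifulSoup so that it runs without dependencies, the class name IngredientRowParser and the output keys are hypothetical, and it assumes each row is a div with the class ingredients-meta.

```python
from html.parser import HTMLParser

# Tags whose text maps onto one field of an ingredient row (per the list above).
FIELD_BY_TAG = {"strong": "quantity", "span": "unit", "a": "food", "small": "note"}


class IngredientRowParser(HTMLParser):
    """Collects one dict per div.ingredients-meta row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._depth = 0     # div nesting depth inside the current row; 0 = outside
        self._field = None  # field that currently receives text

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div":
            if self._depth:
                self._depth += 1
            elif "ingredients-meta" in classes:
                self._depth = 1
                self.rows.append({"quantity": "", "unit": "", "food": "", "note": ""})
        elif self._depth and tag in FIELD_BY_TAG:
            self._field = FIELD_BY_TAG[tag]

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
        elif tag in FIELD_BY_TAG:
            self._field = None

    def handle_data(self, data):
        # Only text inside a mapped tag within a row is kept.
        if self._depth and self._field:
            self.rows[-1][self._field] += data.strip()


def parse_ingredient_rows(html):
    parser = IngredientRowParser()
    parser.feed(html)
    return parser.rows
```

Text outside a row, such as a group title, is ignored here; a real parser would also need to track which group each row belongs to.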

Generic Fallback Parser

For unsupported sites, attempts extraction via:

  1. Schema.org JSON-LD @type: Recipe blocks (recipeIngredient, recipeInstructions)
  2. OpenGraph meta tags for title, description, image
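
As a rough sketch of the first fallback step, the snippet below looks for a schema.org Recipe object in a page's JSON-LD blocks. It relies only on the standard library; the names JsonLdCollector, find_recipe and instruction_texts are hypothetical, and the handling of lists and "@graph" wrappers is an assumption about what real pages may contain, not a description of the project's parser.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buf))


def find_recipe(html):
    """Return the first JSON-LD node whose @type is Recipe, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue
        # A block may hold one object, a list of objects, or an "@graph" wrapper.
        if isinstance(data, dict):
            candidates = data.get("@graph", [data])
        elif isinstance(data, list):
            candidates = data
        else:
            continue
        for node in candidates:
            if not isinstance(node, dict):
                continue
            types = node.get("@type")
            if types == "Recipe" or (isinstance(types, list) and "Recipe" in types):
                return node
    return None


def instruction_texts(recipe):
    """Flatten recipeInstructions given as plain strings or HowToStep objects."""
    steps = recipe.get("recipeInstructions", [])
    if isinstance(steps, str):
        return [steps]
    return [s if isinstance(s, str) else s.get("text", "") for s in steps]
```

Nested HowToSection objects are not unpacked in this sketch.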

Adding a New Site Parser

  1. Create a parser function in app/scraper.py with the @_register("hostname") decorator
  2. The function receives (soup: BeautifulSoup, url: str) and returns the standard recipe dict
  3. The hostname substring is matched against the URL; the first match wins, and unmatched URLs fall back to the generic parser

Mealie API Integration

The importer uses the Mealie REST API:

  1. POST /api/recipes — create a stub recipe (returns slug)
  2. PATCH /api/recipes/{slug} — populate structured ingredients (with unit/food IDs), instructions, description, orgURL
  3. PUT /api/recipes/{slug}/image — upload the recipe image

Structured ingredients: The client resolves unit and food names to Mealie database IDs. Missing units/foods are created automatically via the API. Ingredient groups are supported via the title field on the first ingredient of each group.
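
A minimal sketch of that resolution and grouping logic is shown below. The payload keys (title, quantity, unit, food, note) are modeled on the description above and should be treated as assumptions about the Mealie schema; resolve_id, build_ingredients and the create callbacks are hypothetical names, with the callbacks standing in for the API calls that create missing units and foods.

```python
def resolve_id(name, known, create):
    """Return the ID for name, calling create(name) once if it is not known yet."""
    key = name.strip().lower()
    if key not in known:
        known[key] = create(name)
    return known[key]


def build_ingredients(groups, units, foods, create_unit, create_food):
    """Flatten ingredient groups into one list, titling the first item of each group.

    groups: [{"title": str, "items": [{"quantity", "unit", "food", "note"}, ...]}]
    units / foods: dicts mapping lower-cased names to existing IDs (updated in place).
    """
    def ref(name, known, create):
        # Empty names stay unset instead of creating a blank unit or food.
        return {"id": resolve_id(name, known, create)} if name else None

    payload = []
    for group in groups:
        for index, item in enumerate(group["items"]):
            payload.append({
                # The group title goes on the first ingredient only.
                "title": group["title"] if index == 0 else "",
                "quantity": item["quantity"],
                "unit": ref(item["unit"], units, create_unit),
                "food": ref(item["food"], foods, create_food),
                "note": item.get("note", ""),
            })
    return payload
```

Putting the title on the first ingredient, rather than emitting a separate title-only entry, is what avoids the empty ingredient rows in Mealie.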

Authentication uses a long-lived API token (Bearer header), created in Mealie at Profile → API Tokens.
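
The three calls and the Bearer header can be sketched as one function. To keep the example runnable without a Mealie instance, the HTTP layer is passed in as a send callable; that callable, the function name import_recipe, and the request body shapes (for example {"name": ...} for the stub) are assumptions rather than the project's actual client code.

```python
def import_recipe(send, base_url, token, name, details, image_bytes):
    """Create, populate and illustrate a recipe; returns the slug from step 1.

    send(method, url, headers, body) is a hypothetical transport that performs
    the request and returns the parsed response (the slug for the POST).
    """
    base = base_url.rstrip("/")
    headers = {"Authorization": f"Bearer {token}"}

    # 1. Create a stub recipe; the response carries the slug.
    slug = send("POST", f"{base}/api/recipes", headers, {"name": name})

    # 2. Fill in ingredients, instructions, description and source URL.
    send("PATCH", f"{base}/api/recipes/{slug}", headers, details)

    # 3. Upload the image.
    send("PUT", f"{base}/api/recipes/{slug}/image", headers, image_bytes)

    return slug
```

In a real client, send would wrap the requests library and raise on non-2xx responses so that a failed step stops the sequence.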

Configuration

All settings are persisted to /data/config.json (mounted as a Docker volume).

| Setting | Description |
|---|---|
| mealie_url | Full URL to Mealie instance (e.g. https://mealie.example.com) |
| mealie_api_key | Mealie API token |
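
JSON persistence of these two settings could be sketched as below. The function names, the DEFAULTS dict, and the write-to-temp-then-rename step are assumptions for illustration; app/config.py may work differently.

```python
import json
import os
from pathlib import Path

# Keys from the settings table, with empty defaults for a fresh install.
DEFAULTS = {"mealie_url": "", "mealie_api_key": ""}


def load_config(data_dir="/data"):
    """Read config.json from data_dir, falling back to defaults for missing keys."""
    path = Path(data_dir) / "config.json"
    if not path.exists():
        return dict(DEFAULTS)
    with path.open(encoding="utf-8") as fh:
        stored = json.load(fh)
    return {**DEFAULTS, **stored}


def save_config(config, data_dir="/data"):
    """Write config.json via a temporary file so a crash cannot leave it half-written."""
    path = Path(data_dir) / "config.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(config, indent=2), encoding="utf-8")
    os.replace(tmp, path)
```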

Deployment

Docker Compose

services:
  recipe-importer:
    image: gitea.dooplex.hu/admin/recipe-importer:0.1.7
    container_name: recipe-importer
    restart: unless-stopped
    ports:
      - "8011:8000"
    volumes:
      - recipe-data:/data
    environment:
      - SECRET_KEY=change-me-in-production

volumes:
  recipe-data:

Environment Variables

| Variable | Default | Description |
|---|---|---|
| SECRET_KEY | recipe-importer-dev-key | Flask session secret |
| DATA_DIR | /data | Persistent storage path |
| VERSION | dev | Shown in the UI navbar |
| MEALIE_INTERNAL_URL | (empty) | Docker-internal Mealie URL (e.g. http://mealie:9000), used to avoid hairpinning through Cloudflare |
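
One way these variables might be read at startup is sketched here. The defaults match the table; the function names and the rule that a non-empty MEALIE_INTERNAL_URL replaces the configured mealie_url for API calls are assumptions about how the override is applied.

```python
import os


def runtime_settings(env=None):
    """Collect the documented environment variables with their defaults."""
    env = os.environ if env is None else env
    return {
        "secret_key": env.get("SECRET_KEY", "recipe-importer-dev-key"),
        "data_dir": env.get("DATA_DIR", "/data"),
        "version": env.get("VERSION", "dev"),
        "mealie_internal_url": env.get("MEALIE_INTERNAL_URL", ""),
    }


def api_base_url(configured_url, internal_url):
    """Prefer the Docker-internal URL for API calls when one is set (assumed rule)."""
    return (internal_url or configured_url).rstrip("/")
```

With this rule the public URL stored in the settings can still be shown to the user, while requests from the container go straight to the Mealie service on the Docker network.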

Building

On the build server (kisfenyo@192.168.0.180):

cd ~/build/recipe-importer
./build.sh X.X.X --push

Web UI

The UI is in Hungarian and uses a dark theme. The workflow is:

  1. Settings (/settings) — Enter Mealie URL and API key, test connection
  2. Import (/import) — Paste a recipe URL, click "Beolvasás" (Scrape)
  3. Review — Edit structured ingredients (4-column: quantity, unit, food, note), add/remove ingredient groups, edit instructions
  4. Send — Click "Importálás Mealie-be" to push to Mealie

Tech Stack

  • Runtime: Python 3.12 (slim)
  • Web framework: Flask 3.1 + Gunicorn
  • HTML parsing: BeautifulSoup 4 + lxml
  • HTTP client: requests
  • Container: ~60 MB image