Add gastrohobbi.hu parser (WPBakery page builder layout): ingredients
with groups, instructions with embedded lists, tags from JSON-LD
articleSection, prep time extraction.
Fix ingredient line parser: fractions like "1/2" are no longer split by
regex backtracking, en-dash ranges are normalized, and unicode fractions
(½¼¾) are recognized as a quantity start across all parsers.
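
A minimal sketch of the quantity handling described above, not the actual
parser code. The names split_quantity and normalize_dashes and the exact
regex are assumptions made for illustration; only the three behaviours
(whole fractions, en-dash ranges, unicode fractions) come from this commit.

```python
import re

# Hypothetical pattern: the "1/2" alternative comes first so a fraction is
# consumed as a whole instead of being split after the leading digit.
_NUMBER = r"\d+\s*/\s*\d+|\d+(?:[.,]\d+)?\s*[½¼¾]?|[½¼¾]"
_QTY_RE = re.compile(
    rf"^\s*(?P<qty>(?:{_NUMBER})(?:\s*-\s*(?:{_NUMBER}))?)\s*(?P<rest>.*)$"
)


def normalize_dashes(text):
    """Replace en dashes and em dashes with a plain hyphen."""
    return text.replace("\u2013", "-").replace("\u2014", "-")


def split_quantity(line):
    """Return (quantity, rest); quantity is None when the line has none."""
    line = normalize_dashes(line)
    match = _QTY_RE.match(line)
    if match is None:
        return None, line.strip()
    return match.group("qty").strip(), match.group("rest").strip()
```

For example, split_quantity("2–3 tojás") yields ("2-3", "tojás"), and a line
without a leading quantity comes back with None in the first slot.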
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use get_text(" ") with whitespace normalization to preserve spaces
between text nodes and <a> tag content in ingredient lines
- Use non-greedy .+? for unit in dual measurement regex to handle
multi-word units like "kis fej"
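
The non-greedy unit match can be sketched roughly as follows. This is an
illustration under assumptions: the function name split_dual_measurement,
the returned keys, and the regex itself are hypothetical and may differ
from the real one.

```python
import re

# Hypothetical dual measurement pattern. The lazy ".+?" lets the unit grow
# word by word until the opening parenthesis, so "kis fej" stays together.
_DUAL_RE = re.compile(
    r"^\s*(?P<qty>[\d.,/½¼¾]+)\s+(?P<unit>.+?)\s*\((?P<alt>[^)]*)\)\s*$"
)


def split_dual_measurement(text):
    """Split "3 ek (70 g)" into qty, unit, and the alternate measurement."""
    match = _DUAL_RE.match(text)
    if match is None:
        return None
    return {
        "qty": match.group("qty"),
        "unit": match.group("unit"),
        "comment": match.group("alt").strip(),
    }
```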
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New parser for kiskegyed.hu: ingredients (with groups, dual measurements),
instructions (ol > li > div), tags (section.tags)
- Dual measurement handling: "3 ek (70 g)" extracts alternate measurement
to comment field
- Cross-site linking: kiskegyed→sobors links are followed to get full recipe
(mirrors existing sobors→kiskegyed support)
- Supported sites now shown as clickable URLs in the import page
- supported_sites() returns dicts with name and url
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Global post-processing in scrape() extracts a trailing (comment) from
ingredient food names into the extra/comment field. Works for all parsers.
- Added "Importálás mindkettőbe" ("Import to both") button on single
import page when both Mealie and Tandoor are configured.
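
A rough sketch of the trailing comment extraction, assuming each ingredient
is a dict with "food" and "comment" keys. Those key names and the helper
extract_trailing_comment are assumptions for illustration, not the actual
code in scrape().

```python
import re

# Matches a food name that ends in a single parenthesised group.
_TRAILING_COMMENT_RE = re.compile(
    r"^(?P<name>.*?)\s*\((?P<comment>[^()]*)\)\s*$"
)


def extract_trailing_comment(ingredient):
    """Move a trailing "(comment)" from the food name to the comment field."""
    food = ingredient.get("food") or ""
    match = _TRAILING_COMMENT_RE.match(food)
    # Leave the entry alone if there is no trailing group or no name left.
    if match is None or not match.group("name"):
        return ingredient
    updated = dict(ingredient)
    updated["food"] = match.group("name")
    existing = updated.get("comment") or ""
    extracted = match.group("comment").strip()
    updated["comment"] = f"{existing}, {extracted}" if existing else extracted
    return updated
```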
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Sobors.hu parser: detect external links in instructions and follow them
to scrape real recipe content (e.g. kiskegyed.hu linked recipes)
- Article-style ingredient fallback for sobors.hu pages without structured
ingredient containers (h4 + ul > li plain text)
- Favicon changed to logo_notext_white.svg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- New sobors.hu parser with ingredient groups and section headers
- Incomplete recipe warnings (missing ingredients/instructions)
- Optional HTTP Basic Auth (configurable on settings page)
- Brand text: "Recept" in white, "Importáló" in blue
- Larger logo (36px), favicon using logo_notext.svg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The m-tags__tagItem class is used site-wide for SEO/navigation links.
Scope tag extraction to div.p-recipe__attributeList to only get
actual recipe tags.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extracts ingredients (with groups), instructions (with section
headers), tags, and story-as-description from nosalty.hu recipe pages.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Handles three instruction layouts: ol steps, ul steps, and paragraph-style.
Parses merged qty+unit strings (e.g. "200g" → qty=200, unit=g).
Deduplicates ingredients by targeting the specific grid container.
Tags extracted from JSON-LD recipeCategory.
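
The merged qty+unit split could look roughly like this. The helper name
split_merged_quantity and the regex are assumptions for illustration; only
the "200g" → qty=200, unit=g behaviour is taken from this commit.

```python
import re

# A number immediately (or loosely) followed by a unit made of letters.
# [^\W\d_] matches any unicode letter, so accented units also work.
_MERGED_RE = re.compile(
    r"^\s*(?P<qty>\d+(?:[.,]\d+)?)\s*(?P<unit>[^\W\d_]+)\s*$"
)


def split_merged_quantity(text):
    """Split "200g" into ("200", "g"); return (None, text) if no match."""
    match = _MERGED_RE.match(text)
    if match is None:
        return None, text.strip()
    return match.group("qty"), match.group("unit")
```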
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Scraper extracts tags from mindmegette.hu (<a class="tag">) and schema.org keywords
- Tag editor UI with removable chips, search/autocomplete for existing tags, custom add
- Mealie: auto-create tags via POST /api/organizers/tags, include in recipe PATCH
- Tandoor: include keywords in recipe POST (auto-created by name)
- New GET /tags endpoint returns existing tags from both services for search
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix ingredient groups creating empty entries in Mealie: set title
field on the first ingredient after the group marker instead
- Refactor scraper with @_register decorator for URL-based site dispatch
- Update README with structured ingredients, groups, MEALIE_INTERNAL_URL
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix Mealie image upload 422: send required `extension` field in form data
- Parse ingredient groups from mindmegette (multiple div.ingredients
containers with strong.ingredients-group titles)
- Show group headers in UI with dashed-border accent input
- Pass group markers through to Mealie as title-only ingredient entries
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The scraper looked for span.quantity/span.unit/span.name which don't
exist. The real HTML uses <strong> for qty, plain <span> for unit,
<a class="ingredients-link"> for name, and <small> for extras like
"(darált)". Also add referenceId to Mealie ingredients (required field).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Python/Flask web app that scrapes Hungarian recipe sites (mindmegette.hu)
and imports them into Mealie via its REST API. Includes a dark-themed web
UI with editable preview, Dockerfile, build script, and docker-compose file.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>