scarf/scarf/docs/I18N.md
Alan Wizemann 1726a613a5 feat(i18n): add translations for zh-Hans, de, fr, es, ja, pt-BR
Ships first-pass AI translations for six locales on top of the existing
English base, plus a simple JSON-per-locale contributor workflow so new
languages can land as a single PR.

- 518 keys translated per locale (proper nouns / brand names / format-
  only strings left to fall back to English by design — see the
  "Non-blocking (intentional verbatim)" section of scarf/docs/I18N.md).
- Per-locale source-of-truth lives in tools/translations/<locale>.json;
  tools/merge-translations.py writes them into Localizable.xcstrings
  and is idempotent (re-runnable as translators iterate).
- InfoPlist.xcstrings (macOS microphone permission prompt) translated
  for all six locales.
- knownRegions expanded: zh-Hans, de, fr now joined by es, ja, pt-BR.
- CONTRIBUTING.md gains an "Adding a Language" section documenting the
  fork → JSON → merge → PR flow. Native-speaker reviews welcome.

Closes #13 (the original ask: Simplified Chinese support).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:16:41 -07:00


Internationalization (i18n)

Scarf uses Apple's modern String Catalog workflow. Source strings are auto-extracted from Text("…") and String(localized: …) literals into scarf/scarf/Localizable.xcstrings at build time (when built in Xcode.app; xcodebuild alone emits per-source .stringsdata but does not merge back into the catalog). Info.plist keys are localized via scarf/scarf/InfoPlist.xcstrings.

Languages

| Locale | Status |
| --- | --- |
| en (English) | Base / source |
| zh-Hans (Simplified Chinese) | AI-translated, native-speaker review welcome |
| de (German) | AI-translated, native-speaker review welcome |
| fr (French) | AI-translated, native-speaker review welcome |
| es (Spanish) | AI-translated, native-speaker review welcome |
| ja (Japanese) | AI-translated, native-speaker review welcome |
| pt-BR (Portuguese, Brazil) | AI-translated, native-speaker review welcome |

Canadian French users are served by base fr. fr-CA will be added only if a concrete Québec-specific bug is reported.

Translation workflow

Source-of-truth per locale lives in tools/translations/<locale>.json — a flat { "English": "Translation" } map. The merge step writes those into scarf/scarf/Localizable.xcstrings via:

python3 tools/merge-translations.py

Keys absent from a locale file fall back to English at runtime — this is deliberate for proper nouns (Scarf, Hermes, Anthropic, OAuth, SSH…) and format-only strings (%lld, %@ → %@, ). Re-running the merge is idempotent; iterate on a JSON and re-merge.
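The merge and fallback behaviour described above can be sketched as follows. This is an illustrative sketch, not the contents of tools/merge-translations.py: the function names are hypothetical, and it assumes the usual String Catalog JSON layout (strings → key → localizations → locale → stringUnit). It also assumes that JSON keys unknown to the catalog are skipped rather than created, which the real script may handle differently.

```python
from __future__ import annotations

import json
from pathlib import Path


def merge_locale(catalog: dict, locale: str, translations: dict[str, str]) -> int:
    """Write one locale's translations into a String Catalog dict.

    Returns the number of entries that changed, so a second run over the
    same input returns 0 (the merge is idempotent).
    """
    changed = 0
    strings = catalog.setdefault("strings", {})
    for key, value in translations.items():
        if key not in strings:
            # Assumption: keys the catalog does not know are skipped.
            continue
        unit = {"state": "translated", "value": value}
        localizations = strings[key].setdefault("localizations", {})
        entry = localizations.setdefault(locale, {})
        if entry.get("stringUnit") != unit:
            entry["stringUnit"] = unit
            changed += 1
    return changed


def untranslated_keys(catalog: dict, locale: str) -> list[str]:
    """Keys with no entry for the locale; these fall back to English."""
    return sorted(
        key
        for key, entry in catalog.get("strings", {}).items()
        if locale not in entry.get("localizations", {})
    )


def merge_all(catalog_path: Path, translations_dir: Path) -> int:
    """Merge every <locale>.json in translations_dir into the catalog file."""
    catalog = json.loads(catalog_path.read_text(encoding="utf-8"))
    total = 0
    for locale_file in sorted(translations_dir.glob("*.json")):
        translations = json.loads(locale_file.read_text(encoding="utf-8"))
        total += merge_locale(catalog, locale_file.stem, translations)
    if total:
        # Only rewrite the file when something actually changed.
        catalog_path.write_text(
            json.dumps(catalog, ensure_ascii=False, indent=2) + "\n",
            encoding="utf-8",
        )
    return total
```

Running untranslated_keys for a locale after a merge gives reviewers a quick list of strings that will display in English, which makes it easier to confirm that each omission is intentional.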

Contributor path for new languages is documented in the repo root CONTRIBUTING.md.

Adding a new language

  1. Xcode → Project → Info → Localizations → + (add locale).
  2. Ensure the locale code is also listed in knownRegions of scarf.xcodeproj/project.pbxproj.
  3. Open Localizable.xcstrings in Xcode; the new locale appears as an empty column — translate or use Xcode's AI suggestions.
  4. Repeat for InfoPlist.xcstrings (microphone usage, etc.).
  5. Smoke-test via scheme language override (Edit Scheme → Run → App Language).
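Step 2 is easy to forget, so a small check can help. The sketch below is a hypothetical helper, not part of the repo: it assumes knownRegions appears in project.pbxproj as a parenthesized, comma-separated list terminated by a semicolon, with no inline comments inside the list.

```python
from __future__ import annotations

import re


def known_regions(pbxproj_text: str) -> list[str]:
    """Return the locale codes in the first `knownRegions = ( ... );` block."""
    match = re.search(r"knownRegions\s*=\s*\((.*?)\)\s*;", pbxproj_text, re.DOTALL)
    if match is None:
        return []
    regions = []
    for item in match.group(1).split(","):
        # Codes containing a hyphen are quoted in the project file.
        code = item.strip().strip('"')
        if code:
            regions.append(code)
    return regions


def missing_regions(pbxproj_text: str, wanted: list[str]) -> list[str]:
    """Return the wanted locale codes that knownRegions does not list."""
    listed = set(known_regions(pbxproj_text))
    return [code for code in wanted if code not in listed]
```

For example, passing the text of scarf.xcodeproj/project.pbxproj together with the new locale code to missing_regions returns an empty list once the locale has been registered.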

Adding translations (AI-first workflow)

For the supported non-English locales we use Xcode's built-in AI translation:

  1. Open Localizable.xcstrings in Xcode.
  2. Select untranslated rows for a locale → right-click → Translate (Xcode 26+ provides GPT-backed suggestions with context from the surrounding code comment).
  3. Review each suggestion before marking Translated.
  4. For terms that should NOT translate (proper nouns like Scarf, Hermes, Anthropic; env var names; file paths), wrap the source site in Text(verbatim: "…") so the key never hits the catalog.

Guardrails when writing new UI code

Text("literal") auto-localizes. These patterns silently leak English and need explicit handling:

| Pattern | Fix |
| --- | --- |
| Text(someStringVar) | Text(LocalizedStringResource("key")) or pass a LocalizedStringKey down the view tree |
| "Hello " + name | String(localized: "Hello \(name)") |
| String(format: "$%.2f", cost) | cost.formatted(.currency(code: "USD").precision(.fractionLength(2))) |
| String(format: "%.1f MB", size) | Int64(size).formatted(.byteCount(style: .file)) |
| String(format: "%.1fM", n) | n.formatted(.number.notation(.compactName)) |
| Custom DateFormatter with fixed dateFormat | date.formatted(.dateTime.month().day().year()) |
| .help(stringVar) | Compute a LocalizedStringKey or use .help(Text(…)) |
| Button(stringVar) | Button(LocalizedStringResource("key")) { … } |

Strings that are user data (session titles, memory file contents, log lines, shell commands shown in UI, file paths) should pass through without localization — this happens naturally when the value is a String variable, since those overloads skip the catalog.

Audit status

Phase 1b (the multi-language PR) closed every tracked site from the original audit:

  • Category A high-priority (ternary UI copy) — converted to Text-ternary form so each branch routes through LocalizedStringKey.
  • Category A medium-priority (enum .rawValue displays) — each enum now exposes displayName: LocalizedStringResource and call sites use it. LogEntry.LogLevel (technical jargon) stays verbatim.
  • Category A lower-priority (displayName passthroughs) — wrapped with Text(verbatim:) for proper nouns / user data (HermesToolPlatform, ServerRegistry.Entry, MCPServerPreset). MCPTransport.displayName promoted to LocalizedStringResource.
  • Category B (composite format strings) — migrated to Text("\(arg) suffix") with LocalizedStringKey or to .percent / .currency FormatStyle.
  • Category C (hard-coded day names) — replaced with Calendar.current.shortWeekdaySymbols, re-indexed to match the existing Mon=0 data model.
  • Category D (.help(stringVar) sites): ConnectionStatusPill now returns Text from its labelText / tooltipText properties.

If you spot a new silently-un-localizable site during translation review, prefer the patterns in the table above over one-off workarounds.

Non-blocking (intentional verbatim)

The following are correct as-is because they pass user data or machine-readable content through to the UI:

  • Session titles, message content, memory / skill / YAML file contents, log lines, shell commands, file paths, session IDs, model IDs, credential sources, URL strings.

If we later need to badge these (e.g. "(empty)" placeholder), the badge itself becomes a localizable key while the data passthrough stays verbatim.