Files
scarf/scarf/docs/I18N.md
T
Alan Wizemann 1726a613a5 feat(i18n): add translations for zh-Hans, de, fr, es, ja, pt-BR
Ships first-pass AI translations for six locales on top of the existing
English base, plus a simple JSON-per-locale contributor workflow so new
languages can land as a single PR.

- 518 keys translated per locale (proper nouns / brand names / format-
  only strings left to fall back to English by design — see the
  "Non-blocking (intentional verbatim)" section of scarf/docs/I18N.md).
- Per-locale source-of-truth lives in tools/translations/<locale>.json;
  tools/merge-translations.py writes them into Localizable.xcstrings
  and is idempotent (re-runnable as translators iterate).
- InfoPlist.xcstrings (macOS microphone permission prompt) translated
  for all six locales.
- knownRegions expanded: zh-Hans, de, fr now join by es, ja, pt-BR.
- CONTRIBUTING.md gains an "Adding a Language" section documenting the
  fork → JSON → merge → PR flow. Native-speaker reviews welcome.

Closes #13 (the original ask: Simplified Chinese support).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:16:41 -07:00

85 lines
5.6 KiB
Markdown

# Internationalization (i18n)
Scarf uses Apple's modern **String Catalog** workflow. Source strings are auto-extracted from `Text("…")` and `String(localized: …)` literals into [`scarf/scarf/Localizable.xcstrings`](../scarf/Localizable.xcstrings) at build time (when built in Xcode.app; `xcodebuild` alone emits per-source `.stringsdata` but does not merge back into the catalog). Info.plist keys are localized via [`scarf/scarf/InfoPlist.xcstrings`](../scarf/InfoPlist.xcstrings).
## Languages
| Locale | Status |
|---|---|
| `en` (English) | Base / source |
| `zh-Hans` (Simplified Chinese) | AI-translated, native-speaker review welcome |
| `de` (German) | AI-translated, native-speaker review welcome |
| `fr` (French) | AI-translated, native-speaker review welcome |
| `es` (Spanish) | AI-translated, native-speaker review welcome |
| `ja` (Japanese) | AI-translated, native-speaker review welcome |
| `pt-BR` (Portuguese, Brazil) | AI-translated, native-speaker review welcome |
Canadian French users are served by base `fr`. `fr-CA` will be added only if a concrete Québec-specific bug is reported.
### Translation workflow
Source-of-truth per locale lives in `tools/translations/<locale>.json` — a flat `{ "English": "Translation" }` map. The merge step writes those into `scarf/scarf/Localizable.xcstrings` via:
```bash
python3 tools/merge-translations.py
```
Keys absent from a locale file fall back to English at runtime — this is deliberate for proper nouns (Scarf, Hermes, Anthropic, OAuth, SSH…) and format-only strings (`%lld`, `%@ → %@`, `•`). Re-running the merge is idempotent; iterate on a JSON and re-merge.
Contributor path for new languages is documented in the repo root [CONTRIBUTING.md](../../CONTRIBUTING.md#adding-a-language).
## Adding a new language
1. Xcode → Project → Info → Localizations → `+` (add locale).
2. Ensure the locale code is also listed in `knownRegions` of `scarf.xcodeproj/project.pbxproj`.
3. Open `Localizable.xcstrings` in Xcode; the new locale appears as an empty column — translate or use Xcode's AI suggestions.
4. Repeat for `InfoPlist.xcstrings` (microphone usage, etc.).
5. Smoke-test via scheme language override (Edit Scheme → Run → App Language).
## Adding translations (AI-first workflow)
For the three supported non-English locales we use Xcode's built-in AI translation:
1. Open `Localizable.xcstrings` in Xcode.
2. Select untranslated rows for a locale → right-click → **Translate** (Xcode 26+ provides GPT-backed suggestions with context from the surrounding code comment).
3. Review each suggestion before marking **Translated**.
4. For terms that should NOT translate (proper nouns like *Scarf*, *Hermes*, *Anthropic*; env var names; file paths), wrap the source site in `Text(verbatim: "…")` so the key never hits the catalog.
## Guardrails when writing new UI code
`Text("literal")` auto-localizes. These patterns **silently leak English** and need explicit handling:
| Pattern | Fix |
|---|---|
| `Text(someStringVar)` | `Text(LocalizedStringResource("key"))` or pass a `LocalizedStringKey` down the view tree |
| `"Hello " + name` | `String(localized: "Hello \(name)")` |
| `String(format: "$%.2f", cost)` | `cost.formatted(.currency(code: "USD").precision(.fractionLength(2)))` |
| `String(format: "%.1f MB", size)` | `Int64(size).formatted(.byteCount(style: .file))` |
| `String(format: "%.1fM", n)` | `n.formatted(.number.notation(.compactName))` |
| Custom `DateFormatter` with fixed `dateFormat` | `date.formatted(.dateTime.month().day().year())` |
| `.help(stringVar)` | Compute a `LocalizedStringKey` or use `.help(Text(…))` |
| `Button(stringVar)` | `Button(LocalizedStringResource("key")) { … }` |
Strings that are **user data** (session titles, memory file contents, log lines, shell commands shown in UI, file paths) should pass through without localization — this happens naturally when the value is a `String` variable, since those overloads skip the catalog.
## Audit status
Phase 1b (the `multi-language` PR) closed every tracked site from the original audit:
- **Category A high-priority (ternary UI copy)** — converted to `Text`-ternary form so each branch routes through `LocalizedStringKey`.
- **Category A medium-priority (enum `.rawValue` displays)** — each enum now exposes `displayName: LocalizedStringResource` and call sites use it. `LogEntry.LogLevel` (technical jargon) stays verbatim.
- **Category A lower-priority (displayName passthroughs)** — wrapped with `Text(verbatim:)` for proper nouns / user data (`HermesToolPlatform`, `ServerRegistry.Entry`, `MCPServerPreset`). `MCPTransport.displayName` promoted to `LocalizedStringResource`.
- **Category B (composite format strings)** — migrated to `Text("\(arg) suffix")` with `LocalizedStringKey` or to `.percent` / `.currency` FormatStyle.
- **Category C (hard-coded day names)** — replaced with `Calendar.current.shortWeekdaySymbols`, re-indexed to match the existing Mon=0 data model.
- **Category D (`.help(stringVar)` sites)** — `ConnectionStatusPill` now returns `Text` from its `labelText` / `tooltipText` properties.
If you spot a new silently-un-localizable site during translation review, prefer the patterns in the table above over one-off workarounds.
### Non-blocking (intentional verbatim)
The following are correct as-is because they pass user data or machine-readable content through to the UI:
- Session titles, message content, memory / skill / YAML file contents, log lines, shell commands, file paths, session IDs, model IDs, credential sources, URL strings.
If we later need to badge these (e.g. "(empty)" placeholder), the badge itself becomes a localizable key while the data passthrough stays verbatim.