This chapter explains the goals sought in the creation of GNU gettext and the free Translation Project. Then, it explains a few broad concepts around Native Language Support, and positions message translation with regard to other aspects of cultural variation, as they apply to programs. It also surveys the files used to convey the translations. It explains how the various tools interact in the initial generation of these files, and later, how the maintenance cycle should usually operate.
In this manual, we use the masculine singular when speaking of the programmer or maintainer, the feminine singular when speaking of the translator, and the plural when speaking of the installers or end users of the translated programs. This is only a practical convenience for clarifying the documentation. It is in no way meant to imply that some roles are more appropriate for men or for women. Besides, as you may already have guessed, the GNU gettext suite of utilities is meant to be useful to anyone using a computer, whatever their sex, origin, religion or nationality!
Please send suggestions and corrections to:
Internet address: bug-gnu-gettext@gnu.org
Please include the manual's edition number and update date in any messages.
gettext
Usually, programs are written and documented in English, and use English at execution time for interacting with users. This is true not only of GNU software, but also of a great deal of free and proprietary software. Using a common language is quite handy for communication between developers, maintainers and users from all countries. On the other hand, most people are less comfortable with English than with their own native language, and would rather use their mother tongue for day-to-day work, as far as possible. Many would simply love to see their computer screen showing a lot less English, and far more of their own language.
However, to many people, this dream might appear so far out of reach that they do not even think it worth spending time on. They have no confidence at all that the dream might ever come true. Yet some have not lost hope, and have organized themselves. The Translation Project is a formalization of this hope into a workable structure, which has a good chance of getting all of us nearer to the achievement of a truly multi-lingual set of programs.
GNU gettext is an important step for the Translation Project, as it is an asset on which we may build many other steps. This package offers to programmers, translators and even users, a well integrated set of tools and documentation. Specifically, the GNU gettext utilities are a set of tools that provides a framework within which other free packages may produce multi-lingual messages.
GNU gettext is designed to minimize the impact of internationalization on program sources, keeping this impact as discreet as possible. Internationalization has better chances of succeeding if it is lightweight, or at least appears to be so, when looking at program sources.
The Translation Project also uses the GNU gettext distribution as a vehicle for documenting its structure and methods. This goes beyond the strict technicalities of documenting GNU gettext proper. By so doing, translators will find in a single place, as far as possible, all they need to know for properly doing their translating work. This supplemental documentation might also help programmers, and even curious users, in understanding how GNU gettext is related to the remainder of the Translation Project, and consequently, have a glimpse at the big picture.
Two long words appear all the time when we discuss support for native languages in programs, and these words have a precise meaning, worth explaining here once and for all in this document. The words are internationalization and localization. Many people, tired of writing these long words over and over again, took the habit of writing i18n and l10n instead, quoting the first and last letter of each word, and replacing the run of intermediate letters by a number merely telling how many such letters there are. But in this manual, for the sake of clarity, we will patiently write the names in full, each time…
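The abbreviation rule described above (first letter, number of intermediate letters, last letter) can be sketched in a few lines of C. The function name `numeronym` is made up for this illustration and is not part of GNU gettext.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the abbreviation of `word`: first letter, count of the letters
   in between, last letter.  Returns 0 on success, -1 if the word is too
   short to abbreviate or the output buffer is too small.  */
int numeronym(const char *word, char *out, size_t out_size)
{
    assert(word != NULL && out != NULL);
    size_t len = strlen(word);
    if (len < 3)
        return -1;
    int written = snprintf(out, out_size, "%c%zu%c",
                           word[0], len - 2, word[len - 1]);
    return (written > 0 && (size_t)written < out_size) ? 0 : -1;
}
```

Called on "internationalization" (20 letters) this yields i18n, and on "localization" (12 letters) it yields l10n.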
By internationalization, one refers to the operation by which a program, or a set of programs turned into a package, is made aware of and able to support multiple languages. This is a generalization process, by which the programs are untied from using only English strings or other English-specific habits, and connected to generic ways of doing the same, instead. Program developers may use various techniques to internationalize their programs. Some of these have been standardized. GNU gettext offers one of these standards. See the section The Programmer's View.
By localization, one means the operation by which, in a set of programs already internationalized, one gives the program all the information it needs to adapt itself to handle its input and output in a fashion which is correct for some native language and cultural habits. This is a particularization process, by which generic methods already implemented in an internationalized program are used in specific ways. The programming environment puts several functions at the programmer's disposal which allow this runtime configuration. The formal description of a specific set of cultural habits for some country, together with all associated translations targeted to the same native language, is called the locale for this language or country. Users obtain localized programs by setting proper values in special environment variables, prior to executing those programs, identifying which locale should be used.
In fact, locale message support is only one component of the cultural data that makes up a particular locale. There are a whole host of routines and functions provided to aid programmers in developing internationalized software and which allow them to access the data stored in a particular locale. When someone presently refers to a particular locale, they are obviously referring to the data stored within that particular locale. Similarly, if a programmer refers to “accessing the locale routines”, they are referring to the complete suite of routines that access all of the locale's information.
One uses the expression Native Language Support, or merely NLS, for speaking of the overall activity or feature encompassing both internationalization and localization, allowing for multi-lingual interactions in a program. In a nutshell, one could say that internationalization is the operation by which further localizations are made possible.
Also, very roughly said, when it comes to multi-lingual messages, internationalization is usually taken care of by programmers, and localization is usually taken care of by translators.
For a totally multi-lingual distribution, there are many things to translate beyond output messages.
gettext offers a complete toolset for translating messages output by C programs. Perl scripts and shell scripts will also need to be translated. Even if there are today some hooks by which this can be done, these hooks are not integrated as well as they should be.
Some programs, like autoconf or bison, are able to produce other programs (or scripts). Even if the generating programs themselves are internationalized, the generated programs they produce may need internationalization on their own, and this indirect internationalization could be handled directly from the generating program. In fact, quite usually, generating and generated programs can be internationalized independently, as the efforts needed are largely unrelated.
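Some programs include textual tables that may themselves need translation. For example, the recode program is able to reconstruct, at execution time, descriptions taken from an RFC. Since these descriptions are extracted from the RFC by mechanical means, translating them properly would require translating the RFC table itself first.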
One might want gcc to allow diacritical characters in identifiers or to use translated keywords; ‘rm -i’ might accept something other than ‘y’ or ‘n’ as a reply, etc. Even if the program will eventually produce most of its output in a language other than English, one has to decide whether the input syntax, option values, etc., are to be localized or not.
As we have stressed, translation is only one aspect of localization. Other internationalization aspects are system services, which are handled by GNU libc. Many attributes are needed to define a country's cultural conventions. These attributes include, besides the country's native language, the formatting of dates and times, the representation of numbers, the symbols for currency, etc. These local rules are termed the country's locale. The locale represents the knowledge needed to support the country's native attributes.
There are a few major areas which may vary between countries and hence define what a locale must describe. The following list helps put multi-lingual messages into the proper context of other tasks related to locales. See the GNU libc manual for details.
The codeset most commonly used in the United States of America and most English-speaking parts of the world is the ASCII codeset. However, many characters needed by various locales are not found within this codeset. The 8-bit ISO 8859-1 codeset has most of the special characters needed to handle the major European languages. However, in many cases, choosing ISO 8859-1 is nevertheless not adequate: it doesn't even handle the major European currency. Hence each locale needs appropriate character handling routines to cope with its codeset.
The currency symbol used varies from country to country, as does the position of the symbol. Software needs to be able to transparently display currency figures in the native mode for each locale.
The format of dates varies between locales. For example, Christmas Day in 1994 is written as 12/25/94 in the USA and as 25/12/94 in Australia. Other countries might use ISO 8601 dates, etc.
The time of day may be written as hh:mm, hh.mm, or otherwise. Some locales require the time to be specified in 24-hour mode rather than relying on AM and PM. Further, the nature and yearly extent of the daylight saving time correction vary widely between countries.
Numbers can be represented differently in different locales. For example, the following numbers are all written correctly for their respective locales:
    12,345.67    English
    12.345,67    German
    12 345,67    French
    1,2345.67    Asia
Some programs could go further and use different unit systems, like English units or the metric system, or even take into account variants in how numbers are spelled out in full.
The most obvious area of a locale is language support. This is where GNU gettext provides the means for developers and users to easily change the language that the software uses to communicate with the user.
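The date and number conventions listed above can be made concrete with a small C sketch. It hard-codes the patterns and separators instead of asking the system, purely so that the differences are visible; the helper names `format_date` and `format_number` are invented for this illustration. A real program would normally call `setlocale` and then rely on `strftime` with `%x`, or on the separators reported by `localeconv`, rather than spelling the conventions out by hand.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Format a calendar date with an explicit strftime pattern.
   Returns the number of characters written, or 0 on failure.  */
size_t format_date(char *out, size_t out_size, const char *pattern,
                   int year, int month, int day)
{
    assert(out != NULL && pattern != NULL);
    struct tm date = {0};
    date.tm_year = year - 1900;  /* struct tm counts years from 1900 */
    date.tm_mon = month - 1;     /* and months from 0 */
    date.tm_mday = day;
    return strftime(out, out_size, pattern, &date);
}

/* Format an amount given in hundredths, with an explicit grouping
   separator, group size and decimal separator.
   Returns 0 on success, -1 if the buffer is too small.  */
int format_number(char *out, size_t out_size, unsigned long hundredths,
                  char group_sep, size_t group_size, char decimal_sep)
{
    assert(out != NULL && group_size > 0);
    char digits[32];
    snprintf(digits, sizeof digits, "%lu", hundredths / 100);
    size_t ndigits = strlen(digits);
    size_t pos = 0;
    for (size_t i = 0; i < ndigits; i++) {
        /* A separator goes wherever the count of remaining digits is a
           multiple of the group size.  */
        int need_sep = i > 0 && (ndigits - i) % group_size == 0;
        if (pos + (size_t)need_sep + 1 >= out_size)
            return -1;
        if (need_sep)
            out[pos++] = group_sep;
        out[pos++] = digits[i];
    }
    int n = snprintf(out + pos, out_size - pos, "%c%02lu",
                     decimal_sep, hundredths % 100);
    return (n > 0 && (size_t)n < out_size - pos) ? 0 : -1;
}
```

With the patterns "%m/%d/%y" and "%d/%m/%y", 25 December 1994 comes out as 12/25/94 and 25/12/94 respectively, and "%Y-%m-%d" gives the ISO 8601 form. Calling `format_number` with 1234567 hundredths and the separator pairs (',', '.'), ('.', ','), (' ', ',') reproduces the first three rows of the table; a group size of 4 reproduces the last row.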
These areas of cultural conventions are called locale categories. It is a rather unfortunate term; locale aspects or locale feature categories would be better expressions, because each “locale category” describes an area or task that requires localization. The concrete data that describes the cultural conventions for such an area and for a particular culture is also called a locale category. In this sense, a locale is composed of several locale categories: the locale category describing the codeset, the locale category describing the formatting of numbers, the locale category containing the translated messages, and so on.
The components of locales outside of message handling are standardized in the ISO C standard and the POSIX:2001 standard (also known as the SUSV3 specification). GNU libc fully implements this, and most other modern systems provide a more or less reasonable support for at least some of the missing components.
The letters PO in ‘.po’ files mean Portable Object, to distinguish them from ‘.mo’ files, where MO stands for Machine Object. This paradigm, as well as the PO file format, is inspired by the NLS standard developed by Uniforum, and first implemented by Sun in their Solaris system.
PO files are meant to be read and edited by humans, and associate each original, translatable string of a given package with its translation in a particular target language. A single PO file is dedicated to a single target language. If a package supports many languages, there is one such PO file per language supported, and each package has its own set of PO files. These PO files are best created by the xgettext program, and later updated or refreshed through the msgmerge program. The xgettext program extracts all marked messages from a set of C files and initializes a PO file with empty translations. The msgmerge program takes care of adjusting PO files between releases of the corresponding sources, commenting out obsolete entries, initializing new ones, and updating all source line references. Files ending with ‘.pot’ are a kind of base translation file found in distributions, in PO file format.
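To give a feel for what an entry with an empty translation looks like, here is a hedged C sketch that writes one such entry: a source reference comment, the original string as msgid, and an empty msgstr. The function name `format_po_entry` is hypothetical, and the output is a simplified illustration; the files actually produced by xgettext carry more than this (a header entry, flags, line wrapping).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write one untranslated PO entry for `msgid`, found at `file`:`line`,
   into `out`.  Quotes, backslashes and newlines are escaped in the
   C-like way that PO strings use.
   Returns 0 on success, -1 if the buffer is too small.  */
int format_po_entry(char *out, size_t out_size,
                    const char *file, int line, const char *msgid)
{
    assert(out != NULL && file != NULL && msgid != NULL);
    int n = snprintf(out, out_size, "#: %s:%d\nmsgid \"", file, line);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    size_t pos = (size_t)n;
    for (const char *p = msgid; *p != '\0'; p++) {
        char plain[2] = { *p, '\0' };
        const char *piece;
        switch (*p) {
        case '"':  piece = "\\\""; break;
        case '\\': piece = "\\\\"; break;
        case '\n': piece = "\\n";  break;
        default:   piece = plain;  break;
        }
        size_t len = strlen(piece);
        if (pos + len >= out_size)
            return -1;
        memcpy(out + pos, piece, len);
        pos += len;
    }
    /* The translation starts out empty; the translator fills it in. */
    n = snprintf(out + pos, out_size - pos, "\"\nmsgstr \"\"\n");
    return (n > 0 && (size_t)n < out_size - pos) ? 0 : -1;
}
```

For a string "Hello, world!" followed by a newline at line 12 of a hypothetical hello.c, the entry consists of the three lines `#: hello.c:12`, `msgid "Hello, world!\n"` and `msgstr ""`.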
MO files are meant to be read by programs, and are binary in nature. A few systems already offer tools for creating and handling MO files as part of the Native Language Support coming with the system, but the format of these MO files is often different from system to system, and non-portable. The tools already provided with these systems don't support all the features of GNU gettext. Therefore GNU gettext uses its own format for MO files. Files ending with ‘.gmo’ are really MO files, when it is known that these files use the GNU format.
gettext
The following diagram summarizes the relations between the files handled by GNU gettext and the tools acting on these files. It is followed by somewhat detailed explanations, which you should read while keeping an eye on the diagram. Having a clear understanding of these interrelations will surely help programmers, translators and maintainers.
Original C Sources ───> Preparation ───> Marked C Sources ───╮
                                                             │
              ╭─────────<─── GNU gettext Library             │
╭─── make <───┤                                              │
│             ╰─────────<────────────────────┬───────────────╯
│                                            │
│   ╭─────<─── PACKAGE.pot <─── xgettext <───╯   ╭───<─── PO Compendium
│   │                                            │              ↑
│   │                                            ╰───╮          │
│   ╰───╮                                            ├───> PO editor ───╮
│       ├────> msgmerge ──────> LANG.po ────>────────╯                  │
│   ╭───╯                                                               │
│   │                                                                   │
│   ╰─────────────<───────────────╮                                     │
│                                 ├─── New LANG.po <────────────────────╯
│   ╭─── LANG.gmo <─── msgfmt <───╯
│   │
│   ╰───> install ───> /.../LANG/PACKAGE.mo ───╮
│                                              ├───> "Hello world!"
╰───────> install ───> /.../bin/PROGRAM ───────╯
As a programmer, the first step to bringing GNU gettext into your package is identifying, right in the C sources, those strings which are meant to be translatable, and those which are not. This tedious job can be done a little more comfortably using Emacs PO mode, but you can use any means familiar to you for modifying your C sources. Beside this, some other simple, standard changes are needed to properly initialize the translation library. See the section “Sources” for more information about all this.
For newly written software, the strings of course can and should be marked while the code is being written. The gettext approach makes this very easy. Simply put the following lines at the beginning of each file or in a central header file:
    #define _(String) (String)
    #define N_(String) String
    #define textdomain(Domain)
    #define bindtextdomain(Package, Directory)
Doing this allows you to prepare the sources for internationalization. Later, when you feel ready to use the gettext library, simply replace these definitions by the following:
    #include <libintl.h>
    #define _(String) gettext (String)
    #define gettext_noop(String) String
    #define N_(String) gettext_noop (String)
and link against ‘libintl.a’ or ‘libintl.so’. Note that on GNU systems, you don't need to link with libintl because the gettext library functions are already contained in GNU libc. That is all you have to change.
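As an illustration of the first stage, the following sketch marks strings with the placeholder definitions shown above, so it compiles without any gettext library at all. The function names, the domain name "example" and the directory are made up for this example. The use of `N_()` inside a static initializer, where a function call would not be allowed, reflects the usual purpose of that marker and is an assumption not spelled out in the text above.

```c
#include <assert.h>

/* Placeholder definitions: the markers expand to the plain string and
   the two setup calls expand to nothing.  */
#define _(String) (String)
#define N_(String) String
#define textdomain(Domain)
#define bindtextdomain(Package, Directory)

/* Strings in a static table are only marked here; they are looked up
   with _() when they are actually used.  */
static const char *const weekday_names[] = {
    N_("Monday"),
    N_("Tuesday"),
};

const char *weekday_label(int index)
{
    assert(index >= 0 && index < 2);
    return _(weekday_names[index]);
}

const char *greeting(void)
{
    return _("Hello, world!");
}

void init_i18n(void)
{
    /* Hypothetical package name and catalog directory; with the
       placeholder macros both lines compile to empty statements.  */
    bindtextdomain("example", "/usr/share/locale");
    textdomain("example");
}
```

With the placeholders in effect, `greeting()` simply returns the English string. Swapping in the second set of definitions is meant to leave the rest of the source unchanged.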
Once the C sources have been modified, the xgettext program is used to find and extract all translatable strings, and create a PO template file out of all these. This ‘package.pot’ file contains all original program strings. It has sets of pointers to exactly where in the C sources each string is used. All translations are set to empty. The letter t in ‘.pot’ marks this as a Template PO file, not yet oriented towards any particular language. See the section Invoking the xgettext Program, for more details about how one calls the xgettext program. If you are really lazy, you might be interested in working a lot more right away, and preparing the whole distribution setup (see the section “Maintainers”). By doing so, you spare yourself typing the xgettext command, as make should now generate the proper things automatically for you!
The first time through, there is no ‘lang.po’ yet, so the msgmerge step may be skipped and replaced by a mere copy of ‘package.pot’ to ‘lang.po’, where lang represents the target language. See the section “Creating” for details.
Then comes the initial translation of messages. Translation in itself is a whole matter, still exclusively meant for humans, and whose complexity far exceeds the scope of this manual. Nevertheless, a few hints are given in another chapter of this manual (see the section “Translators”). You will also find there indications about how to contact translating teams, or how to become part of them, for sharing your translating concerns with others who target the same native language.
While adding the translated messages into the ‘lang.po’ PO file, if you are not using one of the dedicated PO file editors (see the section “Editing”), you are on your own for ensuring that your efforts fully respect the PO file format and quoting conventions (see the section “PO Files”). This is surely not an impossible task, as this is the way many people handled PO files around 1995. On the other hand, by using a PO file editor, most details of the PO file format are taken care of for you, but you have to acquire some familiarity with the PO file editor itself.
If some common translations have already been saved into a compendium PO file, translators may use PO mode for initializing untranslated entries from the compendium, and also for saving selected translations into the compendium, updating it (see the section “Compendium”). Compendium files are meant to be exchanged between members of a given translation team.
Programs, or packages of programs, are dynamic in nature: users write bug reports and suggestions for improvements, maintainers react by modifying programs in various ways. The fact that a package has already been internationalized should not make maintainers shy of adding new strings, or modifying strings already translated. They just do their job the best they can. For the Translation Project to work smoothly, it is important that maintainers do not carry translation concerns on their already loaded shoulders, and that translators be kept as free as possible of programming concerns.
The only concern maintainers should have is carefully marking new strings as translatable, when they should be, without otherwise worrying about whether they get translated, as this will come in proper time. Consequently, when programs and their strings are adjusted in various ways by maintainers, and for matters usually unrelated to translation, xgettext would construct ‘package.pot’ files which evolve over time, so the translations carried by ‘lang.po’ slowly go out of date.
It is important for translators (and even maintainers) to understand that package translation is a continuous process in the lifetime of a package, and not something which is done once and for all at the start. After an initial burst of translation activity for a given package, interventions are needed once in a while, because here and there, translated entries become obsolete, and new untranslated entries appear, needing translation.
The msgmerge program has the purpose of refreshing an already existing ‘lang.po’ file, by comparing it with a newer ‘package.pot’ template file, extracted by xgettext out of recent C sources. The refreshing operation adjusts all references to C source locations for strings, since these strings move as programs are modified. Also, msgmerge comments out as obsolete, in ‘lang.po’, those already translated entries which are no longer used in the program sources (see the section “Obsolete Entries”). It finally discovers new strings and inserts them in the resulting PO file as untranslated entries (see the section “Untranslated Entries”). See the section “msgmerge Invocation” for more information about what msgmerge really does.
Whatever route or means taken, the goal is to obtain an updated ‘lang.po’ file offering translations for all strings.
The temporal mobility, or fluidity, of PO files is an integral part of the translation game, and should be well understood and accepted. People resisting it will have a hard time participating in the Translation Project, or will give a hard time to other participants! In particular, maintainers should relax and include all available official PO files in their distributions, even if these have not recently been updated, without exerting pressure on the translation teams to get the job done. The pressure should rather come from the community of users speaking a particular language, and maintainers should consider themselves fairly relieved of any concern about the adequacy of translation files. On the other hand, translators should reasonably try updating the PO files they are responsible for, while the package is undergoing pretest, prior to an official distribution.
Once the PO file is complete and dependable, the msgfmt program is used for turning the PO file into a machine-oriented format, which allows the programs of the package to retrieve translations whenever needed at runtime (see the section “MO Files”). See the section “msgfmt Invocation” for more information about all modes of execution for the msgfmt program.
Finally, the modified and marked C sources are compiled and linked with the GNU gettext library, usually through the operation of make, given that a suitable ‘Makefile’ exists for the project, and the resulting executables are installed somewhere users will find them. The MO files themselves should also be properly installed. Given that the appropriate environment variables are set (see the section “Setting the POSIX Locale”), the program should localize itself automatically, whenever it executes.
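The runtime side of this last step can be sketched as follows, using the real `setlocale`, `bindtextdomain`, `textdomain` and `gettext` functions. The package name and catalog directory are hypothetical. The sketch assumes a system such as GNU libc where `libintl.h` is available; elsewhere it may need to be linked against libintl.

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

#define _(String) gettext (String)

/* Hypothetical package name and catalog directory. */
#define PACKAGE "gettext-manual-example"
#define LOCALEDIR "/usr/share/locale"

/* Call once at program start-up. */
void init_i18n(void)
{
    /* Adopt the locale described by the user's environment variables;
       if that fails, the program stays in the default "C" locale.  */
    setlocale(LC_ALL, "");
    /* Tell the library where the installed MO files of this package
       live, then select the package as the default message domain.  */
    bindtextdomain(PACKAGE, LOCALEDIR);
    const char *domain = textdomain(PACKAGE);
    assert(domain != NULL);
    (void)domain;
}

const char *greeting(void)
{
    /* When no catalog provides a translation, gettext returns the
       original string unchanged.  */
    return _("Hello, world!");
}
```

Since no MO file is installed for this made-up package, `greeting()` falls back to the English text; with an installed ‘LANG/PACKAGE.mo’ matching the user's locale, the same call would return the translation instead.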
The remainder of this manual has the purpose of explaining in depth the various steps outlined above.
Nowadays, when users log into a computer, they usually find that all their programs show messages in their native language, at least for users of languages with an active free software community, like French or German, and to a lesser extent for languages with a smaller participation in free software and the GNU project, like Hindi and Filipino.
How does this work? How can the user influence the language that is used by the programs? This chapter will answer these questions.
2.1 Operating System Installation | The questions asked during installation
2.2 Setting the Locale Used by GUI Programs | How to specify the locale used by programs with a graphical user interface
2.3 Setting the Locale through Environment Variables | How to specify the locale via the POSIX environment variables
2.4 Installing Translations for Particular Programs | How to install additional translations
The default language is often already specified during operating system installation. When the operating system is installed, the installer typically asks for the language used for the installation process and, separately, for the language to use in the installed system. Some OS installers only ask for the language once.
This determines the system-wide default language for all users. But the installers often give the possibility to install extra localizations for additional languages. For example, the localizations of KDE (the K Desktop Environment) and OpenOffice.org are often bundled separately, as one installable package per language.
At this point it is good to consider the intended use of the machine: if it is a machine designated for personal use, additional localizations are probably not necessary. If, however, the machine is in use in an organization or company that has international relationships, one can consider the needs of guest users. If you have a guest from abroad for a week, what would their preferred locales be? It may be worth installing these additional localizations ahead of time, since they cost only a bit of disk space at this point.
The system-wide default language is the locale configuration that is used when a new user account is created. But a user can have their own locale configuration, different from that of the other users of the same machine. They can specify it, typically after the first login, as described in the next section.
The programs immediately available on a user's desktop come from a group of programs called a “desktop environment”; it usually includes the window manager, a web browser, a text editor, and more. The most common free desktop environments are KDE, GNOME, and Xfce.
The locale used by programs with a graphical user interface (GUI) in a desktop environment can be specified in a configuration screen called “control center”, “language settings” or “country settings”.
Individual GUI programs that are not part of the desktop environment can have their locale specified either in a settings panel, or through environment variables.
For some programs, it is possible to specify the locale through environment variables, possibly even to a different locale than the desktop's locale. This means that, instead of starting a program through a menu or from the file manager, you can start it from the command line, after having set some environment variables. The environment variables can be those specified in the next section (see the section “Setting the POSIX Locale”); for some versions of KDE, however, the locale is specified through the variable KDE_LANG, rather than LANG or LC_ALL.
As a user, if your language has been installed for this package, in the simplest case, you only have to set the LANG environment variable to the appropriate ‘ll_CC’ combination. For example, let's suppose that you speak German and live in Germany. At the shell prompt, merely execute ‘setenv LANG de_DE’ (in csh), ‘export LANG; LANG=de_DE’ (in sh) or ‘export LANG=de_DE’ (in bash). This can be done from your ‘.login’ or ‘.profile’ file, once and for all.
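From the program's side, the effect of these variables can be sketched as below. This is a deliberately simplified illustration: it looks only at LC_ALL and LANG, treats an empty value as unset, and leaves the category-specific variables out. The function names are invented for this example, and real programs would normally just call `setlocale (LC_ALL, "")` and let the C library apply the full rules.

```c
#include <assert.h>
#include <stdlib.h>

/* Return the first candidate that is set and non-empty, with LC_ALL
   taking precedence over LANG, and fall back to the "C" locale.  */
const char *pick_locale(const char *lc_all, const char *lang)
{
    if (lc_all != NULL && lc_all[0] != '\0')
        return lc_all;
    if (lang != NULL && lang[0] != '\0')
        return lang;
    return "C";
}

/* Apply the rule to the environment of the current process. */
const char *locale_from_environment(void)
{
    const char *name = pick_locale(getenv("LC_ALL"), getenv("LANG"));
    assert(name[0] != '\0');  /* pick_locale never returns an empty name */
    return name;
}
```

After ‘export LANG=de_DE’ with LC_ALL unset, `locale_from_environment()` returns "de_DE".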
2.3.1 Locale Names | What a locale specification looks like
2.3.2 Locale Environment Variables | Which environment variable specifies what
2.3.3 Specifying a Priority List of Languages | How to specify a priority list of languages
A locale name usually has the form ‘ll_CC’. Here ‘ll’ is the two-letter language code defined by ISO 639, and ‘CC’ is the two-letter country code defined by ISO 3166. For example, for German in Germany, ll is de and CC is DE. You will find a list of the language codes in the appendix on language codes and a list of the country codes in the appendix on country codes.
You might think that the country code specification is redundant. But in fact, some languages have dialects in different countries. For example, ‘de_AT’ is used for Austria, and ‘pt_BR’ for Brazil. The country code serves to distinguish the dialects.
Many locale names have an extended syntax ‘ll_CC.encoding’ that also specifies the character encoding. These names are in use because between 2000 and 2005 most users switched to locales in UTF-8 encoding. For example, the German locale on glibc systems is nowadays ‘de_DE.UTF-8’. The older name ‘de_DE’ still refers to the German locale as of 2000, which stores characters in ISO-8859-1 encoding, a text encoding that cannot even accommodate the Euro currency sign.
Some locale names use ‘ll_CC@variant’ instead of ‘ll_CC’. The ‘@variant’ can denote any kind of characteristic that is not already implied by the language ll and the country CC. It can denote a particular monetary unit. For example, on glibc systems, ‘de_DE@euro’ denotes the locale that uses the Euro currency, in contrast to the older locale ‘de_DE’, which implies the use of the currency in effect before 2002. It can also denote a dialect of the language, or the script used to write text (for example, ‘sr_RS@latin’ uses the Latin script, whereas ‘sr_RS’ uses the Cyrillic script to write Serbian), or the orthography rules, or similar.
On other systems, some variations of this scheme are used, such as ‘ll’. You can get the list of locales supported by your system for your language by running the command ‘locale -a | grep '^ll'’.
There is also a special locale, called ‘C’. When it is used, it disables all localization: in this locale, all programs standardized by POSIX use English messages and an unspecified character encoding (often US-ASCII, but sometimes also ISO-8859-1 or UTF-8, depending on the operating system).
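The naming scheme described above can be illustrated with a small parser. The following is only a sketch: the type locale_parts and the function parse_locale_name are hypothetical names introduced here, not part of gettext or of the C library, and the sketch assumes the general shape ‘ll[_CC][.encoding][@variant]’ without validating the individual codes.

```c
#include <string.h>

/* Hypothetical helper type for this sketch, not part of gettext or libc. */
struct locale_parts {
    char language[16];   /* ll, e.g. "de" */
    char territory[16];  /* CC, e.g. "DE"; empty if absent */
    char codeset[32];    /* encoding, e.g. "UTF-8"; empty if absent */
    char modifier[32];   /* variant, e.g. "euro"; empty if absent */
};

/* Copy at most dstsize-1 bytes of src[0..len) into dst and terminate it. */
static void copy_span(char *dst, size_t dstsize, const char *src, size_t len)
{
    if (len >= dstsize)
        len = dstsize - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Split a name of the form ll[_CC][.encoding][@variant] into its parts. */
void parse_locale_name(const char *name, struct locale_parts *out)
{
    size_t lang_len = strcspn(name, "_.@");
    const char *p = name + lang_len;

    memset(out, 0, sizeof *out);
    copy_span(out->language, sizeof out->language, name, lang_len);

    if (*p == '_') {                       /* country code follows */
        size_t n = strcspn(p + 1, ".@");
        copy_span(out->territory, sizeof out->territory, p + 1, n);
        p += 1 + n;
    }
    if (*p == '.') {                       /* character encoding follows */
        size_t n = strcspn(p + 1, "@");
        copy_span(out->codeset, sizeof out->codeset, p + 1, n);
        p += 1 + n;
    }
    if (*p == '@')                         /* variant follows */
        copy_span(out->modifier, sizeof out->modifier, p + 1, strlen(p + 1));
}
```

Called on ‘de_DE.UTF-8’, the sketch yields the language de, the country DE and the encoding UTF-8; called on ‘C’, only the language field is filled.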
A locale is composed of several locale categories (see Aspects). When a program looks up locale-dependent values, it does so according to the following environment variables, in priority order:
1. LANGUAGE
2. LC_ALL
3. LC_xxx, according to the selected locale category: LC_CTYPE, LC_NUMERIC, LC_TIME, LC_COLLATE, LC_MONETARY, LC_MESSAGES, ...
4. LANG
Variables whose value is set but empty are ignored in this lookup.
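The lookup order for a single category can be sketched as a small function. This is an illustration only: effective_locale is a hypothetical name, the fallback to "C" when nothing is set is an assumption of this sketch, and LANGUAGE is deliberately left out because it is a priority list that applies to messages only.

```c
#include <stddef.h>

/* Return s if it is set and non-empty, otherwise NULL. */
static const char *nonempty(const char *s)
{
    return (s != NULL && s[0] != '\0') ? s : NULL;
}

/* Pick the locale name for one category from the values of LC_ALL, the
   category's own LC_xxx variable and LANG, in that priority order.
   Values that are unset (NULL) or empty are skipped.
   Assumption of this sketch: fall back to "C" when nothing is usable. */
const char *effective_locale(const char *lc_all, const char *lc_xxx,
                             const char *lang)
{
    if (nonempty(lc_all) != NULL)
        return lc_all;
    if (nonempty(lc_xxx) != NULL)
        return lc_xxx;
    if (nonempty(lang) != NULL)
        return lang;
    return "C";
}
```

A caller would typically pass the results of getenv("LC_ALL"), getenv("LC_MESSAGES") and getenv("LANG"). Real programs do not need such a function, since setlocale performs this lookup itself.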
LANG is the normal environment variable for specifying a locale. As a user, you normally set this variable (unless some of the other variables have already been set by the system, in ‘/etc/profile’ or similar initialization files).
LC_CTYPE, LC_NUMERIC, LC_TIME, LC_COLLATE, LC_MONETARY, LC_MESSAGES, and so on, are the environment variables meant to override LANG and affect a single locale category only. For example, assume you are a Swedish user in Spain, and you want your programs to handle numbers and dates according to Spanish conventions, and only the messages should be in Swedish. Then you could create a locale named ‘sv_ES’ or ‘sv_ES.UTF-8’ by use of the localedef program. But it is simpler, and achieves the same effect, to set the LANG variable to es_ES.UTF-8 and the LC_MESSAGES variable to sv_SE.UTF-8; these two locales come already preinstalled with the operating system.
LC_ALL is an environment variable that overrides all of these. It is typically used in scripts that run particular programs. For example, the configure script generated by GNU autoconf uses LC_ALL to make sure that the configuration tests do not operate in locale-dependent ways.
Some systems, unfortunately, set LC_ALL in ‘/etc/profile’ or in similar initialization files. As a user, you therefore have to unset this variable if you want to set LANG and optionally some of the other LC_xxx variables.
The LANGUAGE variable is described in the next subsection.
Not all programs have translations for all languages. By default, an English message is shown in place of a nonexistent translation. If you understand other languages, you can set up a priority list of languages. This is done through a different environment variable, called LANGUAGE. GNU gettext gives preference to LANGUAGE over LC_ALL and LANG for the purpose of message handling, but you still need to have LANG (or LC_ALL) set to the primary language; this is required by other parts of the system libraries. For example, some Swedish users who would rather read translations in German than in English when Swedish is not available set LANGUAGE to ‘sv:de’ while leaving LANG set to ‘sv_SE’.
Special advice for Norwegian users: the language code for Norwegian bokmål changed from ‘no’ to ‘nb’ recently (in 2003). During the transition period, while some message catalogs for this language are installed under ‘nb’ and some older ones under ‘no’, it is recommended that Norwegian users set LANGUAGE to ‘nb:no’ so that both newer and older translations are used.
In the LANGUAGE environment variable, but not in the other environment variables, ‘ll_CC’ combinations can be abbreviated as ‘ll’ to denote the language's main dialect. For example, ‘de’ is equivalent to ‘de_DE’ (German as spoken in Germany), and ‘pt’ to ‘pt_PT’ (Portuguese as spoken in Portugal) in this context.
Note: the variable LANGUAGE is ignored if the locale is set to ‘C’. In other words, you have to first enable localization, by setting LANG (or LC_ALL) to a value other than ‘C’, before you can use a language priority list through the LANGUAGE variable.
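To make the shape of a LANGUAGE value concrete, here is a minimal sketch that walks a colon-separated priority list such as ‘sv:de’. The functions language_entry and language_list_applies are hypothetical names for this illustration and do not mirror the internals of GNU gettext; the second function simply encodes the rule stated above and only checks the literal name ‘C’.

```c
#include <string.h>

/* Copy the index-th entry (0-based) of a colon-separated LANGUAGE value,
   such as "sv:de", into out. Empty entries are skipped.
   Returns 1 on success, 0 if there is no such entry or it does not fit. */
int language_entry(const char *language, size_t index,
                   char *out, size_t outsize)
{
    const char *p = language;

    while (*p != '\0') {
        size_t len = strcspn(p, ":");
        if (len > 0) {
            if (index == 0) {
                if (len >= outsize)
                    return 0;
                memcpy(out, p, len);
                out[len] = '\0';
                return 1;
            }
            index--;
        }
        p += len;
        if (*p == ':')
            p++;
    }
    return 0;
}

/* The priority list is only consulted when the messages locale is not "C". */
int language_list_applies(const char *messages_locale)
{
    return strcmp(messages_locale, "C") != 0;
}
```

With LANGUAGE set to ‘nb:no’, entry 0 is nb and entry 1 is no, which is the order in which catalogs would be tried.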
Languages are not equally well supported in all packages using GNU gettext, and more translations are added over time. Usually, you use the translations that are shipped with the operating system or with particular packages that you install afterwards. But you can also install newer localizations directly. For doing this, you will need an understanding of where each localization file is stored on the file system.
For programs that participate in the Translation Project, you can start looking for translations here: http://translationproject.org/team/index.html. A snapshot of this information is also found in the ‘ABOUT-NLS’ file that is shipped with GNU gettext.
For programs that are part of the KDE project, the starting point is: http://i18n.kde.org/.
For programs that are part of the GNOME project, the starting point is: http://www.gnome.org/i18n/.
For other programs, you may check whether the program's source code package contains some ‘ll.po’ files; often they are kept together in a directory called ‘po/’. Each ‘ll.po’ file contains the message translations for the language whose abbreviation is ll.
The GNU gettext toolset helps programmers and translators at producing, updating and using translation files, mainly those PO files which are textual, editable files. This chapter explains the format of PO files.
A PO file is made up of many entries, each entry holding the relation between an original untranslated string and its corresponding translation. All entries in a given PO file usually pertain to a single project, and all translations are expressed in a single target language. One PO file entry has the following schematic structure:

    white-space
    #  translator-comments
    #. extracted-comments
    #: reference…
    #, flag…
    #| msgid previous-untranslated-string
    msgid untranslated-string
    msgstr translated-string

The general structure of a PO file should be well understood by the translator. When using PO mode, very little has to be known about the format details, as PO mode takes care of them for her.
A simple entry can look like this:

    #: lib/error.c:116
    msgid "Unknown system error"
    msgstr "Error desconegut del sistema"

Entries begin with some optional white space. Usually, when generated through GNU gettext tools, there is exactly one blank line between entries. Then comments follow, on lines all starting with the character #. There are two kinds of comments: those which have some white space immediately following the # (the translator comments), which are created and maintained exclusively by the translator, and those which have some non-white character just after the # (the automatic comments), which are created and maintained automatically by GNU gettext tools. Comment lines starting with #. contain comments given by the programmer, directed at the translator; these comments are called extracted comments because the xgettext program extracts them from the program's source code. Comment lines starting with #: contain references to the program's source code. Comment lines starting with #, contain flags; more about these below. Comment lines starting with #| contain the previous untranslated string for which the translator gave a translation.
All comments, of either kind, are optional.
After white space and comments, entries show two strings, namely first the untranslated string as it appears in the original program sources, and then the translation of this string. The original string is introduced by the keyword msgid, and the translation by msgstr. The two strings, untranslated and translated, are quoted in various ways in the PO file, using " delimiters and \ escapes, but the translator does not really have to pay attention to the precise quoting format, as PO mode fully takes care of quoting for her.
The msgid strings, as well as automatic comments, are produced and managed by other GNU gettext tools, and PO mode does not provide means for the translator to alter these. The most she can do is merely delete them, and only by deleting the whole entry. On the other hand, the msgstr string, as well as translator comments, are really meant for the translator, and PO mode gives her the full control she needs.
The comment lines beginning with #, are special because they are not completely ignored by the programs as comments generally are. The comma-separated list of flags is used by the msgfmt program to give the user better diagnostic messages. Currently there are two forms of flags defined:
fuzzy
This flag can be generated by the msgmerge program or it can be inserted by the translator herself. It shows that the msgstr string might not be a correct translation (anymore). Only the translator can judge whether the translation requires further modification or is acceptable as is. Once satisfied with the translation, she removes this fuzzy attribute. The msgmerge program inserts this flag when it has combined the msgid and msgstr entries after fuzzy search only. See the section on fuzzy entries.
c-format
no-c-format
These flags should not be added by a human. Instead, only the xgettext program adds them. In an automated PO file processing system as proposed here, the user's changes would be thrown away again as soon as the xgettext program generates a new template file.
The c-format flag indicates that the untranslated string and the translation are supposed to be C format strings. The no-c-format flag indicates that they are not C format strings, even though the untranslated string happens to look like a C format string (with ‘%’ directives).
When the c-format flag is given for a string, the msgfmt program does some more tests to check the validity of the translation. See the sections on invoking msgfmt, on the c-format flag, and on C format strings.
objc-format
no-objc-format
Likewise for Objective C, see objc-format.
sh-format
no-sh-format
Likewise for Shell, see sh-format.
python-format
no-python-format
Likewise for Python, see python-format.
lisp-format
no-lisp-format
Likewise for Lisp, see lisp-format.
elisp-format
no-elisp-format
Likewise for Emacs Lisp, see elisp-format.
librep-format
no-librep-format
Likewise for librep, see librep-format.
scheme-format
no-scheme-format
Likewise for Scheme, see scheme-format.
smalltalk-format
no-smalltalk-format
Likewise for Smalltalk, see smalltalk-format.
java-format
no-java-format
Likewise for Java, see java-format.
csharp-format
no-csharp-format
Likewise for C#, see csharp-format.
awk-format
no-awk-format
Likewise for awk, see awk-format.
object-pascal-format
no-object-pascal-format
Likewise for Object Pascal, see object-pascal-format.
ycp-format
no-ycp-format
Likewise for YCP, see ycp-format.
tcl-format
no-tcl-format
Likewise for Tcl, see tcl-format.
perl-format
no-perl-format
Likewise for Perl, see perl-format.
perl-brace-format
no-perl-brace-format
Likewise for Perl brace, see perl-format.
php-format
no-php-format
Likewise for PHP, see php-format.
gcc-internal-format
no-gcc-internal-format
Likewise for the GCC sources, see gcc-internal-format.
qt-format
no-qt-format
Likewise for Qt, see qt-format.
kde-format
no-kde-format
Likewise for KDE, see kde-format.
boost-format
no-boost-format
Likewise for Boost, see boost-format.
It is also possible to have entries with a context specifier. They look like this:

    white-space
    #  translator-comments
    #. extracted-comments
    #: reference…
    #, flag…
    #| msgctxt previous-context
    #| msgid previous-untranslated-string
    msgctxt context
    msgid untranslated-string
    msgstr translated-string

The context serves to disambiguate messages with the same untranslated string. It is possible to have several entries with the same untranslated string in a PO file, provided that they each have a different context. Note that an empty context string and an absent msgctxt line do not mean the same thing.
A different kind of entry is used for translations which involve plural forms.

    white-space
    #  translator-comments
    #. extracted-comments
    #: reference…
    #, flag…
    #| msgid previous-untranslated-string-singular
    #| msgid_plural previous-untranslated-string-plural
    msgid untranslated-string-singular
    msgid_plural untranslated-string-plural
    msgstr[0] translated-string-case-0
    ...
    msgstr[N] translated-string-case-n

Such an entry can look like this:

    #: src/msgcmp.c:338 src/po-lex.c:699
    #, c-format
    msgid "found %d fatal error"
    msgid_plural "found %d fatal errors"
    msgstr[0] "s'ha trobat %d error fatal"
    msgstr[1] "s'han trobat %d errors fatals"

Here also, a msgctxt context can be specified before msgid, like above.
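On the program side, a plural entry like the one above is typically looked up with the ngettext function declared in <libintl.h>, which this section does not discuss. The sketch below is an assumption about how the calling code might look; format_fatal_errors is a hypothetical helper name.

```c
#include <libintl.h>
#include <stdio.h>

/* Format an error count into buf. ngettext() selects the msgstr[i] that
   matches n in the current catalog; when no translation is found it
   returns the singular msgid for n == 1 and the plural msgid otherwise. */
int format_fatal_errors(char *buf, size_t size, int n)
{
    const char *fmt = ngettext("found %d fatal error",
                               "found %d fatal errors",
                               (unsigned long) n);
    return snprintf(buf, size, fmt, n);
}
```

The number is passed twice on purpose: once to ngettext to choose the plural form, and once to snprintf to fill in the ‘%d’ directive.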
The previous-untranslated-string is optionally inserted by the msgmerge program, at the same time as it marks a message fuzzy. It helps the translator to see which changes were made by the developers to the untranslated string.
It happens that some lines, usually white space or comments, follow the very last entry of a PO file. Such lines are not part of any entry, and will be dropped when the PO file is processed by the tools, or may disturb some PO file editors.
The remainder of this section may be safely skipped by those using a PO file editor, yet it may be interesting for everybody to have a better idea of the precise format of a PO file. On the other hand, those wishing to modify PO files by hand should carefully continue reading.
Each of untranslated-string and translated-string respects the C syntax for a character string, including the surrounding quotes and embedded backslashed escape sequences. When the time comes to write multi-line strings, one should not use escaped newlines. Instead, a closing quote should follow the last character on the line to be continued, and an opening quote should resume the string at the beginning of the following PO file line. For example:

    msgid ""
    "Here is an example of how one might continue a very long string\n"
    "for the common case the string represents multi-line output.\n"

In this example, the empty string is used on the first line, to allow better alignment of the H from the word ‘Here’ over the f from the word ‘for’. In this example, the msgid keyword is followed by three strings, which are meant to be concatenated. Concatenating the empty string does not change the resulting overall string, but it is a way for us to comply with the necessity of msgid being followed by a string on the same line, while keeping the multi-line presentation left-justified, as we find this to be a cleaner disposition. The empty string could have been omitted, but only if the string starting with ‘Here’ had been promoted to the first line, right after msgid.(2) It was not really necessary either to switch between the two last quoted strings immediately after the newline ‘\n’; the switch could have occurred after any other character, we just did it this way because it is neater.
One should carefully distinguish between end of lines marked as ‘\n’ inside quotes, which are part of the represented string, and end of lines in the PO file itself, outside string quotes, which have no incidence on the represented string.
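The distinction between quoted and unquoted line ends can be illustrated with a small decoder for the string part of an entry. This is a simplified sketch, not the parser used by the GNU gettext tools: po_decode is a hypothetical name, it expects only the quoted segments that follow a keyword such as msgid, and it handles only the escapes \n, \t, \" and \\.

```c
#include <stddef.h>

/* Decode the quoted segments of one PO string into out.
   Everything outside double quotes, including the line breaks of the
   PO file itself, is ignored; inside quotes, backslash escapes are
   translated. Returns the decoded length, or -1 if out is too small
   or a quote is left open. */
int po_decode(const char *src, char *out, size_t outsize)
{
    size_t n = 0;
    int in_quotes = 0;

    if (outsize == 0)
        return -1;

    for (; *src != '\0'; src++) {
        char c = *src;

        if (!in_quotes) {
            if (c == '"')
                in_quotes = 1;       /* opening quote resumes the string */
            continue;                /* skip text between segments */
        }
        if (c == '"') {
            in_quotes = 0;           /* closing quote ends this segment */
            continue;
        }
        if (c == '\\' && src[1] != '\0') {
            src++;
            switch (*src) {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            default:  c = *src; break;   /* covers \" and \\ */
            }
        }
        if (n + 1 >= outsize)
            return -1;
        out[n++] = c;
    }
    if (in_quotes)
        return -1;
    out[n] = '\0';
    return (int) n;
}
```

Feeding it the three segments of the example above produces a single string containing exactly two newline characters, both of which come from the ‘\n’ escapes rather than from the layout of the PO file.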
Outside strings, white lines and comments may be used freely. Comments start at the beginning of a line with ‘#’ and extend until the end of the PO file line. Comments written by translators should have the initial ‘#’ immediately followed by some white space. If the ‘#’ is not immediately followed by white space, the comment is most likely generated and managed by specialized GNU tools, and might disappear or be replaced unexpectedly when the PO file is given to msgmerge.
For the programmer, changes to the C source code fall into three categories. First, you have to make the localization functions known to all modules needing message translation. Second, you should properly trigger the operation of GNU gettext when the program initializes, usually from the main function. Last, you should identify, adjust and mark all constant strings in your program needing translation.
gettext
Presuming that your set of programs, or package, has been adjusted so that all needed GNU gettext files are available, and that your ‘Makefile’ files are adjusted (see Maintainers), each C module having translated C strings should contain the line:

    #include <libintl.h>

Similarly, each C module containing printf()/fprintf()/... calls with a format string that could be a translated C string (even if the C string comes from a different C module) should contain the line:

    #include <libintl.h>

gettext
The initialization of locale data should be done with more or less the same code in every program, as demonstrated below:

    int
    main (int argc, char *argv[])
    {
      …
      setlocale (LC_ALL, "");
      bindtextdomain (PACKAGE, LOCALEDIR);
      textdomain (PACKAGE);
      …
    }

PACKAGE and LOCALEDIR should be provided either by ‘config.h’ or by the Makefile. For now, consult the gettext or hello sources for more information.
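The three calls can also be wrapped in a small helper that reports failures, which may be convenient while experimenting. The following is a sketch under assumptions: init_i18n is a hypothetical name, and the package name and locale directory are passed as parameters here instead of coming from ‘config.h’ or the Makefile.

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Set up the locale and the message domain for a program.
   Returns 0 on success, -1 if the domain could not be bound or selected. */
int init_i18n(const char *package, const char *localedir)
{
    /* Adopt the locale named by the environment. If setlocale() returns
       NULL the requested locale is unavailable and the program simply
       keeps running in the "C" locale, so this is not treated as fatal. */
    setlocale(LC_ALL, "");

    /* Tell gettext where the message catalogs of this package live. */
    if (bindtextdomain(package, localedir) == NULL)
        return -1;

    /* Make this package the default domain for gettext() calls. */
    if (textdomain(package) == NULL)
        return -1;

    return 0;
}
```

A program would call it once at the start of main, for example as init_i18n (PACKAGE, LOCALEDIR), before producing any translatable output.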
The use of LC_ALL might not be appropriate for you. LC_ALL includes all locale categories and especially LC_CTYPE. This latter category is responsible for determining character classes with isalnum and the other functions from ‘ctype.h’, whose results could be wrong, especially for programs that process some kind of input language. For example, this would mean that a source code using the ç (the c-cedilla character) is runnable in France but not in the U.S.
Some systems also have problems with parsing numbers using the scanf functions if a locale category other than LC_ALL is used. The standards say that additional formats besides the one known in the "C" locale might be recognized. But some systems seem to reject numbers in the "C" locale format. In some situations, there might also be a problem with the notation itself, which makes it impossible to recognize whether the number is in the "C" locale or the local format. This can happen if thousands separator characters are used. Some locales define this character, according to the national conventions, as '.', which is the same character used in the "C" locale to denote the decimal point.
So it is sometimes necessary to replace the LC_ALL line in the code above by a sequence of setlocale lines:

    {
      …
      setlocale (LC_CTYPE, "");
      setlocale (LC_MESSAGES, "");
      …
    }

On all POSIX conformant systems the locale categories LC_CTYPE, LC_MESSAGES, LC_COLLATE, LC_MONETARY, LC_NUMERIC, and LC_TIME are available. On some systems which are only ISO C compliant, LC_MESSAGES is missing, but a substitute for it is defined in GNU gettext's <libintl.h> and in GNU gnulib's <locale.h>.
Note that changing LC_CTYPE also affects the functions declared in the <ctype.h> standard header and some functions declared in the <string.h> and <stdlib.h> standard headers. If this is not desirable in your application (for example in a compiler's parser), you can use a set of substitute functions which hardwire the C locale, such as those found in the modules ‘c-ctype’, ‘c-strcasestr’, ‘c-strtod’, ‘c-strtold’ in the GNU gnulib source distribution.
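To show what hardwiring the C locale means in practice, here is a minimal sketch of locale-independent character tests. The function names follow the style of gnulib's ‘c-ctype’ module, but the bodies below are written for this illustration and are not gnulib's source; they assume an ASCII-compatible execution character set.

```c
/* Locale-independent character classification for the ASCII range.
   Unlike isalnum() from <ctype.h>, the results never depend on
   setlocale(), so a byte such as 0xE7 is never treated as a letter. */
int c_isdigit(int c)
{
    return c >= '0' && c <= '9';
}

int c_isalpha(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

int c_isalnum(int c)
{
    return c_isalpha(c) || c_isdigit(c);
}
```

A parser that uses such functions for identifiers accepts the same input in every locale, which is usually what a compiler or configuration file reader wants.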
It is also possible to switch the locale back and forth between the environment-dependent locale and the C locale, but this approach is normally avoided because a setlocale call is expensive, because it is tedious to determine the places where a locale switch is needed in a large program's source, and because switching the locale is not multithread-safe.
Before strings can be marked for translation, they sometimes need to be adjusted. Usually preparing a string for translation is done right before marking it, during the marking phase which is described in the next sections. The following guidelines are what you have to keep in mind while doing that.
Let's look at some examples of these guidelines.
Translatable strings should be in good English style. If slang language with abbreviations and shortcuts is used, often translators will not understand the message and will produce very inappropriate translations.

    "%s: is parameter\n"

This is nearly untranslatable: is the displayed item a parameter or the parameter?

    "No match"

The ambiguity in this message makes it unintelligible: is the program attempting to set something on fire? Does it mean "The given object does not match the template"? Does it mean "The template does not fit for any of the objects"?
In both cases, adding more words to the message will help both the translator and the English-speaking user.
Translatable strings should be entire sentences. It is often not possible to translate single verbs or adjectives in a substitutable way.

    printf ("File %s is %s protected", filename, rw ? "write" : "read");

Most translators will not look at the source and will thus only see the string "File %s is %s protected", which is unintelligible. Change this to

    printf (rw ? "File %s is write protected" : "File %s is read protected",
            filename);

This way the translator will not only understand the message, she will also be able to find the appropriate grammatical construction. A French translator, for example, translates "write protected" as "protected against writing".
Entire sentences are also important because in many languages, the declination of some words in a sentence depends on the gender or the number (singular/plural) of another part of the sentence. There are usually more interdependencies between words than in English. The consequence is that asking a translator to translate two half-sentences and then combining these two half-sentences through dumb string concatenation will not work, for many languages, even though it would work for English. That is why translators need to handle entire sentences.
Often sentences do not fit into a single line. If a sentence is output using two subsequent printf statements, like this

    printf ("Locale charset \"%s\" is different from\n", lcharset);
    printf ("input file charset \"%s\".\n", fcharset);

the translator would have to translate two half-sentences, but nothing in the PO file would tell her that the two half-sentences belong together. It is necessary to merge the two printf statements so that the translator can handle the entire sentence at once and decide at which place to insert a line break in the translation (if at all):

    printf ("Locale charset \"%s\" is different from\n\
    input file charset \"%s\".\n", lcharset, fcharset);

You may now ask: how about two or more adjacent sentences? Like in this case:

    puts ("Apollo 13 scenario: Stack overflow handling failed.");
    puts ("On the next stack overflow we will crash!!!");

Should these two statements be merged into a single one? I would recommend merging them if the two sentences are related to each other, because then it makes it easier for the translator to understand and translate both. On the other hand, if one of the two messages is a stereotypical one, occurring in other places as well, you will do the translator a favour by not merging the two. (Identical messages occurring in several places are combined by xgettext, so the translator has to handle them only once.)
Translatable strings should be limited to one paragraph; don't let a single message be longer than ten lines. The reason is that when the translatable string changes, the translator is faced with the task of updating the entire translated string. Maybe only a single word will have changed in the English string, but the translator doesn't see that (with the current translation tools), therefore she has to proofread the entire message.
Many GNU programs have a ‘--help’ output that extends over several screen pages. It is a courtesy towards the translators to split such a message into several ones of five to ten lines each. While doing that, you can also attempt to split the documented options into groups, such as the input options, the output options, and the informative output options. This will help every user to find the option he is looking for.
Hardcoded string concatenation is sometimes used to construct English sentences:

    strcpy (s, "Replace ");
    strcat (s, object1);
    strcat (s, " with ");
    strcat (s, object2);
    strcat (s, "?");

In order to present the translator with only entire sentences, and also because in some languages the translator might want to swap the order of object1 and object2, it is necessary to change this to use a format string:

    sprintf (s, "Replace %s with %s?", object1, object2);

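Combined with the gettext function and a bounded buffer, the format-string version might look like the sketch below. The helper name format_replace_prompt is hypothetical, and the use of snprintf instead of sprintf is a choice of this illustration rather than something the text above prescribes.

```c
#include <libintl.h>
#include <stdio.h>

/* Build the prompt from one translatable format string, so that the
   translator sees the whole sentence as a single msgid. */
int format_replace_prompt(char *buf, size_t size,
                          const char *object1, const char *object2)
{
    return snprintf(buf, size, gettext("Replace %s with %s?"),
                    object1, object2);
}
```

On systems whose printf supports POSIX positional arguments, a translation that needs the reverse order can use directives such as ‘%2$s’ and ‘%1$s’ in its msgstr without any change to the C code.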
Un cas similaire est la concaténation des chaînes à la compilation.
Le fichier ISO C 99 à en-tête <inttypes.h
contient une
macro PRId64
qui peut être utilisée comme une directive de
formatage pour écrire un entier ‘int64_t’ avec printf
.
Il se développe en une chaîne constante, habituellement "d"
ou "ld" ou "lld" ou quelque chose d'approchant, en fonction
de la plateforme. Supposons que vous ayez un code comme celui ci
printf ("La quantité est %0" PRId64 "\n", number); |
Les outils et bibliothèques de gettext
ont un support particulier
pour ces macros <inttypes.h>
. C'est pourquoi vous pouvez simplement
écrire
printf (gettext ("La quantité est %0" PRId64 "\n"), number); |
Le fichier PO contiendra la chaîne "La quantité est %0<PRId64>\n". La
traductrice
donnera une traduction contenant aussi "%0<PRId64>" et au moment de
l'exécution
le résultat de la fonction gettext
contiendra la chaîne constante
appropriée,
"d", "ld" ou "lld".
This works only for the predefined <inttypes.h> macros. If you have defined your own similar macros, say ‘MYPRId64’, that are not known to gettext, the solution is to change the code as follows:

    char buf1[100];
    sprintf (buf1, "%0" MYPRId64, number);
    printf (gettext ("The amount is %s\n"), buf1);

This puts the platform-dependent code in one statement and the internationalization code in a different statement. Note that a buffer length of 100 is safe, because all available hardware integer types are limited to 128 bits, and printing a 128-bit integer takes at most 54 characters, whether in decimal, octal or hexadecimal.
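The two-statement split can be sketched as a small compilable function. MYPRId64 is defined here as "lld" purely for illustration (a real project would define it per platform), and format_amount is a hypothetical name.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical project-specific macro that xgettext does not know.
   Assumed here to match 'long long'.  */
#define MYPRId64 "lld"

/* Step 1 formats the number with the platform-dependent directive.
   Step 2 inserts the result into a msgid containing only a plain %s.  */
int
format_amount (char *out, size_t size, long long number)
{
  char buf1[100];

  snprintf (buf1, sizeof buf1, "%0" MYPRId64, number);
  return snprintf (out, size, gettext ("The amount is %s\n"), buf1);
}
```

The msgid seen by the translator is then "The amount is %s\n", which contains no platform-dependent part.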
All this applies to other programming languages as well. For example, in Java and C#, string concatenation is used very frequently, because it is a compiler built-in operator. As in C, in Java you would change

    System.out.println("Replace "+object1+" with "+object2+"?");

into a statement involving a format string:

    System.out.println(
        MessageFormat.format("Replace {0} with {1}?",
                             new Object[] { object1, object2 }));

Similarly, in C# you would change

    Console.WriteLine("Replace "+object1+" with "+object2+"?");

into a statement involving a format string:

    Console.WriteLine(
        String.Format("Replace {0} with {1}?", object1, object2));
Unusual markup and control characters should not be used in translatable strings. Translators will likely not understand the particular meaning of such markup or control characters.

For example, if you have a convention that ‘|’ delimits the left-hand and right-hand parts of some GUI element, translators will often not understand it without specific comments. It might be better to have the translator translate the left-hand and right-hand parts separately.

Another example is the ‘argp’ convention of using a single ‘\v’ (vertical tab) control character to delimit two sections inside a string. This is flawed. Some translators may convert it to a simple newline, others to a blank line. With some PO file editors it may not even be easy to enter a vertical tab control character. So you cannot be sure that the translation will contain a ‘\v’ character at the corresponding position. The solution is, again, to let the translator translate two separate strings and combine them at run time with the ‘\v’ required by the convention.
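A minimal sketch of that run-time combination follows. The helper name join_argp_doc is hypothetical; callers would pass the results of two separate gettext calls as before and after.

```c
#include <stdio.h>

/* Hypothetical helper: joins two separately translated sections with
   the '\v' separator required by the argp convention, so the control
   character never appears in a msgid or msgstr.  */
int
join_argp_doc (char *out, size_t size,
               const char *before, const char *after)
{
  return snprintf (out, size, "%s\v%s", before, after);
}
```

With this arrangement the position of the vertical tab is fixed by the program and cannot be lost or altered in a PO file.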
HTML markup, however, is common enough that it is probably acceptable to use it in translatable strings. But keep in mind that the GNU gettext tools do not verify that the translations are well-formed HTML.
All strings requiring translation should be marked in the C sources. Marking is done in such a way that each translatable string appears to be the sole argument of some function or preprocessor macro. Only a few such functions or macros are meant for translation, and their names then serve as marking keywords. The marking is attached to the strings themselves rather than to what we do with them. This approach has more uses than one. A blatant example is an error message produced by formatting. The format string needs translation, as do some strings inserted through a ‘%s’ specification in the format, while the result from sprintf may have so many different instances that it is impractical to list them all in some ‘error_string_out()’ routine, say.
This marking operation has two goals. The first is to trigger the retrieval of the translation at run time. The keyword is possibly resolved into a routine able to dynamically return the proper translation, as far as possible or wanted, for the argument string. Most localizable strings are found in executable positions, that is, attached to variables or given as parameters to functions. But this is not universal usage, and some translatable strings appear in structured initializations. See the section on special cases.
The second goal of the marking operation is to help xgettext properly extract all translatable strings when it scans a set of program sources and produces PO file templates.
The canonical keyword for marking translatable strings is ‘gettext’; it gave its name to the whole GNU gettext package. For packages making only light use of the ‘gettext’ keyword, macro or function, it is easily used as is. However, for packages using the gettext interface more heavily, it is usually more convenient to give the main keyword a shorter, less obtrusive name. Indeed, the keyword might appear on a lot of strings all over the package, and programmers usually do not want nor need their program sources to remind them forcefully, all the time, that they are internationalized. Further, a long keyword has the disadvantage of using more horizontal space, forcing more indentation work on sources for those trying to keep them within 79 or 80 columns.
Many packages use ‘_’ (a simple underline) as a keyword, and write ‘_("Translatable string")’ instead of ‘gettext ("Translatable string")’. Further, the coding rule from the GNU standards, which calls for a space between the keyword and the opening parenthesis, is relaxed in practice for this particular usage. So the textual overhead per translatable string is reduced to only three characters: the underline and the two parentheses. However, even if GNU gettext uses this convention internally, it does not offer it officially. The real, genuine keyword is truly ‘gettext’. It is fairly easy for those wanting to use ‘_’ instead of ‘gettext’ to declare:

    #include <libintl.h>
    #define _(String) gettext (String)

instead of merely using ‘#include <libintl.h>’.
The marking keywords ‘gettext’ and ‘_’ take the translatable string as sole argument. It is also possible to define marking functions that take it at another argument position. It is even possible to make the marked argument position depend on the total number of arguments of the function call; this is useful in C++. All this is achieved using the ‘--keyword’ option of xgettext.
Note also that long strings can be split across lines, into multiple adjacent string tokens. Automatic string concatenation is performed at compile time according to ISO C and ISO C++; xgettext also supports this syntax.
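A short sketch of a marked string split into adjacent tokens is given below. The function name two_part_message and the message text are made up for illustration.

```c
#include <libintl.h>

#define _(String) gettext (String)

/* Hypothetical example: the two adjacent literals are joined by the
   compiler into one string, which becomes a single msgid.  */
const char *
two_part_message (void)
{
  return _("Line one, "
           "line two.");
}
```

Both the compiler and xgettext treat the argument as the single string "Line one, line two.".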
Later on, maintenance is relatively easy. If, as a programmer, you add or modify a string, you have to ask yourself whether the new or altered string requires translation, and wrap it in ‘_()’ if you think it should be translated. For example, ‘"%s"’ is a string that does not require translation. But ‘"%s: %d"’ does require translation, because in French, unlike in English, it is customary to put a space before a colon.
In PO mode, one set of features is meant more for the programmer than for the translator, and lets him interactively mark which strings, in a set of program sources, are translatable and which are not. Even if it is a fairly easy job for a programmer to find and mark such strings by other means, using any editor of his choice, PO mode makes this work more comfortable. Further, this gives translators who feel a little like programmers, or programmers who feel a little like translators, a tool letting them work at marking translatable strings in the program sources, while simultaneously producing a set of translations in some language, for the package being internationalized.
The set of program sources targeted by the PO mode commands described here should have an Emacs tags table constructed for your project before you use these PO file commands. This is easy to do. In any shell window, change the directory to the root of your project, then execute a command resembling:

    etags src/*.[hc] lib/*.[hc]

presuming here that you want to process all ‘.h’ and ‘.c’ files from the ‘src/’ and ‘lib/’ directories. This command will explore all said files and create a ‘TAGS’ file in your root directory, somewhat summarizing the contents in a special file format that Emacs can understand.
For packages following the GNU coding standards, there is a make goal tags or TAGS which constructs the tag files in all directories and for all files containing source code.
Once your ‘TAGS’ file is ready, the following commands assist the programmer at marking translatable strings in his set of sources. But these commands are necessarily driven from within a PO file window, and it is likely that you do not even have such a PO file yet. This is not a problem at all, as you may safely open a new, empty PO file, mainly for using these commands. This empty PO file will slowly fill in while you mark strings as translatable in your program sources.
,
    Search through program sources for a string which looks like a candidate for translation (po-tags-search).

M-,
    Mark the last string found with ‘_()’ (po-mark-translatable).

M-.
    Mark the last string found with a keyword taken from a set of possible keywords. This command with a prefix allows some management of these keywords (po-select-mark-and-mark).
The , (po-tags-search) command searches for the next occurrence of a string which looks like a possible candidate for translation, and displays the program source in another Emacs window, positioned in such a way that the string is near the top of this other window. If the string is too big to fit whole in this window, it is positioned so only its end is shown. In any case, the cursor is left in the PO file window. If the shown string would be better presented differently in different native languages, you may mark it using M-, or M-.. Otherwise, you might rather ignore it and skip to the next string by merely repeating the , command.
A string is a good candidate for translation if it contains a sequence of three or more letters. A string containing at most two letters in a row will be considered as a candidate if it has more letters than non-letters. The command disregards strings containing no letters, or isolated letters only. It also disregards strings within comments, or strings already marked with some keyword PO mode knows (see below).
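The letter-counting rule above can be approximated in C as follows. This is only a sketch of the rule as described in the text, not PO mode's actual Emacs Lisp code; the function name looks_translatable is hypothetical, and the handling of comments and already-marked strings is left out.

```c
#include <ctype.h>
#include <stdbool.h>

/* Sketch of the candidate heuristic: three or more letters in a row
   always qualify; runs of exactly two letters qualify only if letters
   outnumber non-letters; isolated letters or no letters never do.  */
bool
looks_translatable (const char *s)
{
  int letters = 0, others = 0, run = 0, longest = 0;

  for (; *s != '\0'; s++)
    {
      if (isalpha ((unsigned char) *s))
        {
          letters++;
          run++;
          if (run > longest)
            longest = run;
        }
      else
        {
          others++;
          run = 0;
        }
    }

  if (longest >= 3)
    return true;
  if (longest == 2)
    return letters > others;
  return false;
}
```

Under this sketch, "%s" and "a-b" are rejected because they contain only isolated letters, while "ab" is accepted because its two letters outnumber its zero non-letters.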
If you have never told Emacs about some ‘TAGS’ file to use, the command will ask you to specify one in the minibuffer the first time you use it. You may later change your ‘TAGS’ file by using the regular Emacs command M-x visit-tags-table, which will ask you to name the precise ‘TAGS’ file you want to use. See the section ‘Tag Tables’ in The Emacs Editor.

Each time you use the , command, the search resumes from where it was left by the previous search, and goes through all program sources, as listed in the ‘TAGS’ file, until all sources have been processed. However, by giving a prefix argument to the command (C-u ,), you may request that the search be restarted all over again from the first program source; but in this case, strings that you recently marked as translatable will be automatically skipped.
Using this , command does not prevent using the other regular Emacs tags commands. For example, regular tags-search or tags-query-replace commands may be used without disrupting the independent , search sequence. However, as implemented, the initial , command (or a , command used with a prefix) might also reinitialize the regular Emacs tags searching to the first tags file; this reinitialization might be unwanted.
The M-, (po-mark-translatable) command will mark the recently found string with the ‘_’ keyword. The M-. (po-select-mark-and-mark) command will ask you, in the minibuffer, to type a keyword, and will use that keyword for marking the string. Both commands automatically create a new, untranslated PO file entry for the string being marked, and make it the current entry (making it easy for you to immediately proceed to its translation, if you feel like doing it right away). It is possible that the modifications made to the program source by M-, or M-. render some source line longer than 80 columns, forcing you to break and re-indent this line differently. You may use the O command from PO mode, or any other window-changing command from Emacs, to break out into the program source window and do any needed adjustments. You will have to use some regular Emacs command to return the cursor to the PO file window if you want to use the , command for the next string, say.
The M-. command has a few built-in speedups, so you do not have to explicitly type all keywords all the time. The first such speedup is that you are presented with a preferred keyword, which you may accept by merely typing <RET> at the prompt. The second speedup is that you may type any unambiguous prefix of the keyword you really mean, and the command will complete it automatically for you. This also means that PO mode has to know all your possible keywords, and that it will not accept mistyped keywords.
If you reply ? to the keyword request, the command gives a list of all known keywords, from which you may choose. When the command is prefixed by an argument (C-u M-.), it inhibits updating any program source or PO file buffer, and does some keyword management instead. In this case, the command asks for a keyword, written in full, which becomes a new allowed keyword for later M-. commands. Moreover, this new keyword automatically becomes the preferred keyword for later commands. Typing an already known keyword in response to C-u M-. merely changes the preferred keyword and does nothing more.

All keywords known for M-. are recognized by the , command when scanning for strings, and strings already marked by any of those known keywords are automatically skipped. If many PO files are opened simultaneously, each one has its own independent set of known keywords. There is currently no provision in PO mode for deleting a known keyword; you have to quit the file (maybe using q) and reopen it afresh. When a PO file is newly brought up in an Emacs window, only ‘gettext’ and ‘_’ are known as keywords, and ‘gettext’ is preferred for the M-. command. In any case, there is little point in preferring ‘_’, as this one is already built into the M-, command.
In C programs, strings are often used within calls of functions from the printf family. The special thing about these format strings is that they can contain format specifiers introduced with %. Assume we have the code

    printf (gettext ("String `%s' has %d characters\n"), s, strlen (s));

A possible German translation for the above string might be:

    "%d Zeichen lang ist die Zeichenkette `%s'"
A C programmer, even if he cannot speak German, will recognize that there is something wrong here. The order of the two format specifiers is changed, but of course the arguments in the printf call are not. This will most probably lead to problems, because now the length of the string is regarded as the address.
To prevent errors at run time caused by translations, the msgfmt tool can check statically whether the arguments in the original and the translation string match in type and number. If this is not the case and the ‘-c’ option has been passed to msgfmt, msgfmt will give an error and refuse to produce a MO file. Thus consistent use of ‘msgfmt -c’ will catch the error, so that it cannot cause problems at run time.
If the word order in the above German translation were correct, one would have to write

    "%2$d Zeichen lang ist die Zeichenkette `%1$s'"

The routines in msgfmt know about this special notation.
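The effect of the numbered notation can be tried with a small sketch. Numbered argument specifications such as %1$s are a POSIX extension to printf rather than part of ISO C, so this assumes a POSIX C library; the function name format_length_report is hypothetical, and format stands in for the string that gettext would return.

```c
#include <stdio.h>

/* Hypothetical helper: the argument order in the call is fixed (string
   first, length second).  A translated format may refer to the
   arguments by number to change the order in the output.  */
int
format_length_report (char *out, size_t size, const char *format,
                      const char *s, int length)
{
  return snprintf (out, size, format, s, length);
}
```

Called with "%2$d Zeichen lang ist die Zeichenkette `%1$s'", the length is printed first even though it is passed second.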
Because not all strings in a program are format strings, it is not useful for msgfmt to test all the strings in the ‘.po’ file. This might cause problems, because a string might contain what looks like a format specifier without being used in printf.
Therefore xgettext adds a special tag to those messages it thinks might be format strings. There is no absolute rule for this, only a heuristic. In the ‘.po’ file the entry is marked using the c-format flag in the #, comment line (see the section on PO files).
The careful reader might now point out that this, too, can cause problems. The heuristic might guess wrong. This is true, and therefore xgettext knows about a special kind of comment which lets the programmer take over the decision. If xgettext finds, in the same line as the gettext keyword or in the line immediately preceding it, a comment containing the words xgettext:c-format, it will mark the string in any case with the c-format flag. This kind of comment should be used when xgettext does not recognize the string as a format string but it really is one and should be tested.
This situation happens quite often. The printf function is often called with strings which do not contain a format specifier. Of course one would normally use fputs, but it does happen. In this case xgettext does not recognize the string as a format string, but what happens if the translation introduces a valid format specifier? The printf function will try to access one of the parameters, but none exists, because the original code does not pass any parameters.
xgettext can of course make a wrong decision the other way round, i.e. a string marked as a format string is not really a format string. In this case the msgfmt program might give too many warnings and would prevent translating the ‘.po’ file. The method to prevent this wrong decision is the same as above, only the comment to use must contain the string xgettext:no-c-format.
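The placement of the two comments can be sketched as follows. The function names and message texts are invented for illustration; whether the heuristic would actually misjudge these particular strings is an assumption.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical example: contains a percent sign but is never used as
   a format string, so the c-format flag is explicitly suppressed.  */
const char *
disk_full_notice (void)
{
  /* xgettext:no-c-format */
  return gettext ("Disk is 100% full");
}

/* Hypothetical example: passed to printf although it has no format
   specifier, so the c-format flag is explicitly requested and msgfmt -c
   can reject a translation that introduces one.  */
int
print_banner (void)
{
  /* xgettext:c-format */
  return printf (gettext ("Ready.\n"));
}
```

In both functions the comment sits on the line immediately preceding the gettext keyword, which is one of the two positions described above.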
If a string is marked with c-format and this is not correct, the user can find out who is responsible for the decision. See the section on invoking the xgettext program to learn how the --debug option can be used to solve this problem.
The attentive reader might now object that it is not always possible to mark translatable strings with gettext or something similar. Consider the following case:

    {
      static const char *messages[] = {
        "some very meaningful message",
        "and another one"
      };
      const char *string;
      …
      string
        = index > 1 ? "a default message" : messages[index];

      fputs (string, stdout);
      …
    }

While it is no problem to mark the string "a default message", it is not possible to mark the string initializers for messages.
What is to be done? We have to fulfill two tasks. First, we have to mark the strings so that the xgettext program (see the section on invoking the xgettext program) can find them, and second, we have to translate the strings at run time before printing them.
The first task can be fulfilled by creating a new keyword which names a no-op. For the second, we have to mark all access points to a string from the array. So one solution can look like this:

    #define gettext_noop(String) String

    {
      static const char *messages[] = {
        gettext_noop ("some very meaningful message"),
        gettext_noop ("and another one")
      };
      const char *string;
      …
      string
        = index > 1 ? gettext ("a default message") : gettext (messages[index]);

      fputs (string, stdout);
      …
    }

Please convince yourself that the string which is written by fputs is translated in any case. How to get xgettext to know the additional keyword gettext_noop is explained in the section on invoking the xgettext program.
The above is of course not the only solution. You could also get by with the following:

    #define gettext_noop(String) String

    {
      static const char *messages[] = {
        gettext_noop ("some very meaningful message"),
        gettext_noop ("and another one")
      };
      const char *string;
      …
      string
        = index > 1 ? gettext_noop ("a default message") : messages[index];

      fputs (gettext (string), stdout);
      …
    }

But this has a drawback. The programmer has to take care that he uses gettext_noop for the string "a default message". Using gettext there could, in rare cases, have unpredictable results.

One advantage is that you need not do control flow analysis to make sure the output is really translated in every case. But this analysis is generally not very difficult. If it ever becomes difficult, you can use this second method in that situation.
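The second method can be condensed into a compilable sketch. The function name message_for and the use of size_t for the index are assumptions; the strings and the gettext_noop macro are taken from the text above.

```c
#include <libintl.h>
#include <stddef.h>

/* No-op marker: lets xgettext extract the string without translating
   it at the point of definition.  */
#define gettext_noop(String) String

static const char *messages[] = {
  gettext_noop ("some very meaningful message"),
  gettext_noop ("and another one")
};

/* Hypothetical accessor: every string is only marked where it is
   written and is translated in one place, just before it is returned.  */
const char *
message_for (size_t index)
{
  const char *string
    = index > 1 ? gettext_noop ("a default message") : messages[index];

  return gettext (string);
}
```

Because the single gettext call sits on the only exit path, no control flow analysis is needed to see that every returned string has been looked up.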
Code sometimes has bugs, but translations sometimes have bugs too. The users need to be able to report them. Reporting translation bugs to the programmer or maintainer of a package is not very useful, since the maintainer must never change a translation, except on behalf of the translator. Hence translation bugs must be reported to the translators.

Here is a way to organize this so that the maintainer does not need to forward translation bug reports, nor even keep a list of the addresses of the translators or their translation teams.
Every program has a place where it shows the bug report address. For GNU programs, it is the code which handles the ‘--help’ option, typically in a function called ‘usage’. In this place, instruct the translator to add her own bug reporting address. For example, if that code has a statement

    printf (_("Report bugs to <%s>.\n"), PACKAGE_BUGREPORT);

you can add some translator instructions like this:

    /* TRANSLATORS: The placeholder indicates the bug-reporting address
       for this package.  Please add _another line_ saying
       "Report translation bugs to <...>\n" with the address for translation
       bugs (typically your translation team's web or email address).  */
    printf (_("Report bugs to <%s>.\n"), PACKAGE_BUGREPORT);

These will be extracted by ‘xgettext’, leading to a .pot file that contains this:

    #. TRANSLATORS: The placeholder indicates the bug-reporting address
    #. for this package.  Please add _another line_ saying
    #. "Report translation bugs to <...>\n" with the address for translation
    #. bugs (typically your translation team's web or email address).
    #: src/hello.c:178
    #, c-format
    msgid "Report bugs to <%s>.\n"
    msgstr ""
Should names of persons, cities, locations etc. be marked for translation or not? People who only know languages that can be written with Latin letters (English, Spanish, French, German, etc.) are tempted to say “no”, because names usually do not change when transported between these languages. However, in general when translating from one script to another, names are translated too, usually phonetically or by transliteration. For example, Russian or Greek names are converted to the Latin alphabet when being translated to English, and English or French names are converted to the Katakana script when being translated to Japanese. This is necessary because the speakers of the target language in general cannot read the script the name is originally written in.
As a programmer, you should therefore make sure that names are marked for translation, with a special comment telling the translators that it is a proper name and how to pronounce it. Like this:

    printf (_("Written by %s.\n"),
            /* TRANSLATORS: This is a proper name.  See the gettext
               manual, section Names.  Note this is actually a non-ASCII
               name: The first name is (with Unicode escapes)
               "Fran\u00e7ois" or (with HTML entities) "Fran&ccedil;ois".
               Pronunciation is like "fraa-swa pee-nar".  */
            _("Francois Pinard"));
As a translator, you should use some care when translating names, because it is frustrating if people see their names mutilated or distorted. If your language uses the Latin script, all you need to do is to reproduce the name as perfectly as you can within the usual character set of your language. In this particular case, this means providing a translation containing the c-cedilla character. If your language uses a different script and the people speaking it do not usually read Latin words, it means transliteration; but you should still give, in parentheses, the original writing of the name, for the sake of the people who do read the Latin script. Here is an example, using Greek as the target script:

    #. This is a proper name.  See the gettext
    #. manual, section Names.  Note this is actually a non-ASCII
    #. name: The first name is (with Unicode escapes)
    #. "Fran\u00e7ois" or (with HTML entities) "Fran&ccedil;ois".
    #. Pronunciation is like "fraa-swa pee-nar".
    msgid "Francois Pinard"
    msgstr "\phi\rho\alpha\sigma\omicron\alpha \pi\iota\nu\alpha\rho"
           " (Francois Pinard)"
Because translation of names is such a sensitive domain, it is a good idea to test your translation before submitting it.

The translation project http://sourceforge.net/projects/translation has set up a POT file and translation domain consisting of program author names, with better facilities for the translator than those presented here. Namely, there the original name is written directly in Unicode (rather than with Unicode escapes or HTML entities), and the pronunciation is denoted using the International Phonetic Alphabet (see http://www.wikipedia.org/wiki/International_Phonetic_Alphabet).

However, we do not recommend this approach for all POT files in all packages, because this would force translators to use PO files in UTF-8 encoding, which is, in the present state of software (as of 2003), a major hassle for translators using GNU Emacs or XEmacs with po-mode.
When you are preparing a library, not a program, for the use of gettext, only a few details are different. Here we assume that the library has a translation domain and a POT file of its own. (If it uses the translation domain and POT file of the main program, then the previous sections apply without changes.)
1. The library code does not call setlocale (LC_ALL, ""). It is the responsibility of the main program to set the locale. The documentation of the library should mention this fact, so that developers of programs using the library are aware of it.

2. The library code does not call textdomain (PACKAGE), because it would interfere with the text domain set by the main program.

3. The initialization code for a program was

       setlocale (LC_ALL, "");
       bindtextdomain (PACKAGE, LOCALEDIR);
       textdomain (PACKAGE);

   For a library it is reduced to

       bindtextdomain (PACKAGE, LOCALEDIR);

   If your library's API does not already have an initialization function, you need to create one, containing at least the bindtextdomain invocation. However, you usually do not need to export and document this initialization function: it is sufficient that all entry points of the library call the initialization function if it has not been called before. The typical idiom used to achieve this is a static boolean variable that indicates whether the initialization function has been called. Like this:

       static bool libfoo_initialized;

       static void
       libfoo_initialize (void)
       {
         bindtextdomain (PACKAGE, LOCALEDIR);
         libfoo_initialized = true;
       }

       /* This function is part of the exported API.  */
       struct foo *
       create_foo (...)
       {
         /* Must ensure the initialization is performed.  */
         if (!libfoo_initialized)
           libfoo_initialize ();
         ...
       }

       /* This function is part of the exported API.  The argument must be
          non-NULL and have been created through create_foo().  */
       int
       foo_refcount (struct foo *argument)
       {
         /* No need to invoke the initialization function here, because
            create_foo() must already have been called before.  */
         ...
       }

4. The usual declaration of the ‘_’ macro in each source file was

       #include <libintl.h>
       #define _(String) gettext (String)

   for a program. For a library, which has its own translation domain, it reads like this:

       #include <libintl.h>
       #define _(String) dgettext (PACKAGE, String)

   In other words, dgettext is used instead of gettext. Similarly, the dngettext function should be used in place of the ngettext function.
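The library-side pieces can be put together in one sketch. The values given to PACKAGE and LOCALEDIR, the function name libfoo_describe_count and the message texts are all assumptions made for illustration; a real library would take the two macros from its build system.

```c
#include <libintl.h>
#include <stdio.h>

/* Assumed values, normally supplied by the build system.  */
#define PACKAGE "libfoo"
#define LOCALEDIR "/usr/share/locale"

static int libfoo_initialized;

/* Binds only the library's own domain; no setlocale or textdomain.  */
static void
libfoo_initialize (void)
{
  bindtextdomain (PACKAGE, LOCALEDIR);
  libfoo_initialized = 1;
}

/* Hypothetical entry point: looks up a plural-aware message in the
   library's domain with dngettext instead of ngettext.  */
int
libfoo_describe_count (char *out, size_t size, unsigned long n)
{
  if (!libfoo_initialized)
    libfoo_initialize ();
  return snprintf (out, size,
                   dngettext (PACKAGE, "%lu item", "%lu items", n), n);
}
```

Without an installed catalog, dngettext falls back to the English singular for n equal to 1 and to the plural otherwise, so the function remains usable before any translation exists.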
After preparing the sources, the programmer creates a PO template file. This section explains how to use xgettext for this purpose.

xgettext creates a file named ‘domainname.po’. You should then rename it to ‘domainname.pot’. (Why doesn't xgettext create it under the name ‘domainname.pot’ right away? The answer is: for historical reasons. When xgettext was specified, the distinction between a PO file and a PO file template was fuzzy, and the suffix ‘.pot’ wasn't in use at that time.)
5.1 Invocation le programme msginit |
xgettext
    xgettext [option] [inputfile] …
The xgettext program extracts translatable strings from the given input files.
‘inputfile …’: Input files.
‘-f file’, ‘--files-from=file’: Read the names of the input files from file instead of getting them from the command line.
‘-D directory’, ‘--directory=directory’: Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If inputfile is ‘-’, standard input is read.
‘-d name’, ‘--default-domain=name’: Use ‘name.po’ for output (instead of ‘messages.po’).
‘-o file’, ‘--output=file’: Write output to specified file (instead of ‘name.po’ or ‘messages.po’).
‘-p dir’, ‘--output-dir=dir’: Output files will be placed in directory dir.
If the output file is ‘-’ or ‘/dev/stdout’, the output is written to standard output.
‘-L name’, ‘--language=name’: Specifies the language of the input files. The supported languages are C, C++, ObjectiveC, PO, Shell, Python, Lisp, EmacsLisp, librep, Scheme, Smalltalk, Java, JavaProperties, C#, awk, YCP, Tcl, Perl, PHP, GCC-source, NXStringTable, RST, Glade.
‘-C’, ‘--c++’: This is a shorthand for --language=C++.
By default the language is guessed from the file name extension of the input file.
‘--from-code=name’: Specifies the encoding of the input files. This option is needed only if some untranslated message strings or their corresponding comments contain non-ASCII characters. Note that Tcl and Glade input files are always assumed to be in UTF-8, regardless of this option.
By default the input files are assumed to be in ASCII.
‘-j’, ‘--join-existing’: Join messages with existing file.
‘-x file’, ‘--exclude-file=file’: Entries from file are not extracted. file should be a PO or POT file.
‘-c [tag]’, ‘--add-comments[=tag]’: Place comment block with tag (or those preceding keyword lines) in output file.
‘-a’, ‘--extract-all’: Extract all strings.
This option has an effect with most languages, namely C, C++, ObjectiveC, Shell, Python, Lisp, EmacsLisp, librep, Java, C#, awk, Tcl, Perl, PHP, GCC-source, Glade.
‘-k[keywordspec]’, ‘--keyword[=keywordspec]’: Additional keyword to be looked for (without keywordspec means not to use default keywords).
If keywordspec is a C identifier id, xgettext
looks for
strings in the first argument of each call to the function or macro
id. If keywordspec is of the form ‘id:argnum’,
xgettext
looks for strings in the argnumth argument of the
call. If keywordspec is of the form
‘id:argnum1,argnum2’, xgettext
looks for
strings in the argnum1st argument and in the argnum2nd argument
of the call, and treats them as singular/plural variants for a message with
plural handling. Also, if keywordspec is of the form
‘id:contextargnumc,argnum’ or
‘id:argnum,contextargnumc’, xgettext
treats
strings in the contextargnumth argument as a context specifier. And,
as a special-purpose support for GNOME, if keywordspec is of the form
‘id:argnumg’, xgettext
recognizes the argnumth
argument as a string with context, using the GNOME glib
syntax
‘"msgctxt|msgid"’.
Furthermore, if keywordspec is of the form
‘id:…,totalnumargst’, xgettext
recognizes this
argument specification only if the number of actual arguments is equal to
totalnumargs. This is useful for disambiguating overloaded function
calls in C++.
Finally, if keywordspec is of the form
‘id:argnum...,"xcomment"’, xgettext
, when
extracting a message from the specified argument strings, adds an extracted
comment xcomment to the message. Note that when used through a normal
shell command line, the double-quotes around the xcomment need to be
escaped.
This option has an effect with most languages, namely C, C++, ObjectiveC, Shell, Python, Lisp, EmacsLisp, librep, Java, C#, awk, Tcl, Perl, PHP, GCC-source, Glade.
The default keyword specifications, which are always looked for if not explicitly disabled, are language dependent. They are:
For C, C++, and GCC-source: gettext, dgettext:2, dcgettext:2, ngettext:1,2, dngettext:2,3, dcngettext:2,3, gettext_noop, and pgettext:1c,2, dpgettext:2c,3, dcpgettext:2c,3, npgettext:1c,2,3, dnpgettext:2c,3,4, dcnpgettext:2c,3,4.
For Objective C: like for C, and also NSLocalizedString, _, NSLocalizedStaticString, __.
For Shell scripts: gettext, ngettext:1,2, eval_gettext, eval_ngettext:1,2.
For Python: gettext, ugettext, dgettext:2, ngettext:1,2, ungettext:1,2, dngettext:2,3, _.
For Lisp: gettext, ngettext:1,2, gettext-noop.
For EmacsLisp: _.
For librep: _.
For Scheme: gettext, ngettext:1,2, gettext-noop.
For Java: GettextResource.gettext:2, GettextResource.ngettext:2,3, GettextResource.pgettext:2c,3, GettextResource.npgettext:2c,3,4, gettext, ngettext:1,2, pgettext:1c,2, npgettext:1c,2,3, getString.
For C#: GetString, GetPluralString:1,2, GetParticularString:1c,2, GetParticularPluralString:1c,2,3.
For awk: dcgettext, dcngettext:1,2.
For Tcl: ::msgcat::mc.
For Perl: gettext, %gettext, $gettext, dgettext:2, dcgettext:2, ngettext:1,2, dngettext:2,3, dcngettext:2,3, gettext_noop.
For PHP: _, gettext, dgettext:2, dcgettext:2, ngettext:1,2, dngettext:2,3, dcngettext:2,3.
For Glade 1: label, title, text, format, copyright, comments, preview_text, tooltip.
To disable the default keyword specifications, the option ‘-k’ or ‘--keyword’ or ‘--keyword=’, without a keywordspec, can be used.
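To make the keywordspec forms more concrete, here is a minimal sketch of a C source file that uses project-specific wrappers instead of the default keywords. The names app_tr, app_tr_n and the domain "app" are hypothetical. Under these assumptions, the string literals at the call sites would be picked up with an invocation such as `xgettext --keyword=app_tr --keyword=app_tr_n:2,3 app.c`: the first spec selects the first argument, the second spec selects arguments 2 and 3 as a singular/plural pair.

```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>

/* Hypothetical wrapper around gettext.  The spec "app_tr" tells
   xgettext to look at the first argument of each call.  */
const char *
app_tr (const char *msgid)
{
  assert (msgid != NULL);
  return gettext (msgid);
}

/* Hypothetical wrapper around dngettext.  The spec "app_tr_n:2,3"
   tells xgettext that arguments 2 and 3 are the singular and plural
   variants of one message.  */
const char *
app_tr_n (const char *domain, const char *singular, const char *plural,
          unsigned long n)
{
  assert (singular != NULL && plural != NULL);
  return dngettext (domain, singular, plural, n);
}

/* Call sites: only string literals appearing here are extracted.  */
const char *
app_greeting (void)
{
  return app_tr ("Hello, world!");
}

const char *
app_file_count_format (unsigned long n)
{
  return app_tr_n ("app", "%lu file", "%lu files", n);
}
```

Note that nothing is extracted from the wrapper bodies themselves, since their arguments are variables rather than string literals.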
‘--flag=word:arg:flag’: Specifies additional flags for strings occurring as part of the argth
argument of the function word. The possible flags are the possible
format string indicators, such as ‘c-format’, and their negations, such
as ‘no-c-format’, possibly prefixed with ‘pass-’.
The meaning of --flag=function:arg:lang-format
is
that in language lang, the specified function expects as
argth argument a format string. (For those of you familiar with GCC
function attributes, --flag=function:arg:c-format
is
roughly equivalent to the declaration ‘__attribute__ ((__format__
(__printf__, arg, ...)))’ attached to function in a C source
file.) For example, if you use the ‘error’ function from GNU libc, you
can specify its behaviour through --flag=error:3:c-format
. The
effect of this specification is that xgettext
will mark as format
strings all gettext
invocations that occur as argth argument of
function. This is useful when such strings contain no format string
directives: together with the checks done by ‘msgfmt -c’ it will ensure
that translators cannot accidentally use format string directives that would
lead to a crash at runtime.
The meaning of --flag=function:arg:pass-lang-format
is that in language lang, if the function call occurs in a
position that must yield a format string, then its argth argument must
yield a format string of the same type as well. (If you know GCC function
attributes, the --flag=function:arg:pass-c-format
option
is roughly equivalent to the declaration ‘__attribute__
((__format_arg__ (arg)))’ attached to function in a C source
file.) For example, if you use the ‘_’ shortcut for the gettext
function, you should use --flag=_:1:pass-c-format
. The effect of
this specification is that xgettext
will propagate a format string
requirement for a _("string")
call to its first argument, the literal
"string"
, and thus mark it as a format string. This is useful when
such strings contain no format string directives: together with the checks
done by ‘msgfmt -c’ it will ensure that translators cannot accidentally
use format string directives that would lead to a crash at runtime.
This
option has an effect with most languages, namely C, C++, ObjectiveC, Shell,
Python, Lisp, EmacsLisp, librep, Scheme, Java, C#, awk, YCP, Tcl, Perl, PHP,
GCC-source.
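The two forms of ‘--flag’ described above can be sketched with a small C example. The function app_format is a hypothetical printf-like helper whose third argument is the format string, so the matching specifications would be `--flag=app_format:3:c-format` together with `--flag=_:1:pass-c-format` for the ‘_’ shortcut. The message used here contains no format directives, which is exactly the case in which the flags matter.

```c
#include <assert.h>
#include <libintl.h>
#include <stdarg.h>
#include <stdio.h>

#define _(String) gettext (String)

/* Hypothetical printf-like helper: FORMAT is its third argument, hence
   the specification --flag=app_format:3:c-format.  */
int
app_format (char *buf, size_t size, const char *format, ...)
{
  va_list args;
  int n;

  assert (buf != NULL && format != NULL);
  va_start (args, format);
  n = vsnprintf (buf, size, format, args);
  va_end (args);
  return n;
}

/* The literal below has no directives.  With the two --flag
   specifications, xgettext would still mark it as c-format, so that
   'msgfmt -c' can reject a translation that introduces a stray
   directive such as "%s".  */
int
app_report_disk_full (char *buf, size_t size)
{
  return app_format (buf, size, _("disk is full"));
}
```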
‘-T’, ‘--trigraphs’: Understand ANSI C trigraphs for input.
This option has an effect only
with the languages C, C++, ObjectiveC.
‘--qt’: Recognize Qt format strings.
This option has an effect only with the
language C++.
‘--kde’: Recognize KDE 4 format strings.
This option has an effect only with the
language C++.
‘--boost’: Recognize Boost format strings.
This option has an effect only with the
language C++.
‘--debug’: Use the flags c-format
and possible-c-format
to show who was
responsible for marking a message as a format string. The latter form is
used if the xgettext
program decided, the format form is used if the
programmer prescribed it.
By default only the c-format
form is used. The translator should not
have to care about these details.
This implementation of xgettext
is able to process a few awkward
cases, like strings in preprocessor macros, ANSI concatenation of adjacent
strings, and escaped end of lines for continued strings.
‘--force-po’: Always write an output file even if no message is defined.
‘-i’, ‘--indent’: Write the .po file using indented style.
‘--no-location’: Do not write ‘#: filename:line’ lines. Note that using this option makes it harder for technically skilled translators to understand each message's context.
‘-n’, ‘--add-location’: Generate ‘#: filename:line’ lines (default).
‘--strict’: Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
‘--properties-output’: Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
‘--stringtable-output’: Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
‘-w number’, ‘--width=number’: Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
‘--no-wrap’: Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
‘-s’, ‘--sort-output’: Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
‘-F’, ‘--sort-by-file’: Sort output by file location.
‘--omit-header’: Don't write header with ‘msgid ""’ entry.
This is useful for testing purposes because it eliminates a source of
variance for generated .gmo
files. With --omit-header
, two
invocations of xgettext
on the same files with the same options at
different times are guaranteed to produce the same results.
Note that using this option will lead to an error if the resulting file would not entirely be in ASCII.
‘--copyright-holder=string’: Set the copyright holder in the output. string should be the copyright holder of the surrounding package. (Note that the msgid strings, extracted from the package's sources, belong to the copyright holder of the package.) Translators are expected to transfer or disclaim the copyright for their translations, so that package maintainers can distribute them without legal risk. If string is empty, the output files are marked as being in the public domain; in this case, the translators are expected to disclaim their copyright, again so that package maintainers can distribute them without legal risk.
The default value for string is the Free Software Foundation, Inc.,
simply because xgettext
was first used in the GNU project.
‘--foreign-user’: Omit FSF copyright in output. This option is equivalent to ‘--copyright-holder=''’. It can be useful for packages outside the GNU project that want their translations to be in the public domain.
‘--package-name=package’: Set the package name in the header of the output.
‘--package-version=version’: Set the package version in the header of the output. This option has an effect only if the ‘--package-name’ option is also used.
‘--msgid-bugs-address=email@address’: Set the reporting address for msgid bugs. This is the email address or URL to which the translators shall report bugs in the untranslated strings.
It can be your email address, or a mailing list address where translators can write to without being subscribed, or the URL of a web page through which the translators can contact you.
The default value is empty, which means that translators will be clueless! Don't forget to specify this option.
‘-m[string]’, ‘--msgstr-prefix[=string]’: Use string (or "" if not specified) as prefix for msgstr entries.
‘-M[string]’, ‘--msgstr-suffix[=string]’: Use string (or "" if not specified) as suffix for msgstr entries.
‘-h’, ‘--help’: Display this help and exit.
‘-V’, ‘--version’: Output version information and exit.
When starting a new translation, the translator creates a file called ‘LANG.po’, as a copy of the ‘package.pot’ template file with modifications in the initial comments (at the beginning of the file) and in the header entry (the first entry, near the beginning of the file).
The easiest way to do so is to use the ‘msginit’ program. For example:
    $ cd PACKAGE-VERSION
    $ cd po
    $ msginit
The alternative is to do the copy and the modifications by hand. To do so, the translator copies ‘package.pot’ to ‘LANG.po’. Then she modifies the initial comments and the header entry of this file.
6.1 Invoking the msginit Program
6.2 Filling in the Header Entry
msginit
    msginit [option]
The msginit
program creates a new PO file, initializing the meta
information with values from the user's environment.
‘-i inputfile’, ‘--input=inputfile’: Input POT file.
If no inputfile is given, the current directory is searched for the POT file. If it is ‘-’, standard input is read.
‘-o file’, ‘--output-file=file’: Write output to specified PO file.
If no output file is given, it depends on the ‘--locale’ option or the user's locale setting. If it is ‘-’, the results are written to standard output.
‘-P’, ‘--properties-input’: Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
‘--stringtable-input’: Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
‘-l ll_CC’, ‘--locale=ll_CC’: Set target locale. ll should be a language code, and CC should be a country code. The command ‘locale -a’ can be used to output a list of all installed locales. The default is the user's locale setting.
‘--no-translator’: Declares that the PO file will not have a human translator and is instead automatically generated.
‘-p’, ‘--properties-output’: Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
‘--stringtable-output’: Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
‘-w number’, ‘--width=number’: Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
‘--no-wrap’: Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
‘-h’, ‘--help’: Display this help and exit.
‘-V’, ‘--version’: Output version information and exit.
The initial comments "SOME DESCRIPTIVE TITLE", "YEAR" and "FIRST AUTHOR <EMAIL@ADDRESS>, YEAR" ought to be replaced by sensible information. This can be done in any text editor; if Emacs is used and it switched to PO mode automatically (because it has recognized the file's suffix), you can disable it by typing M-x fundamental-mode.
Modifying the header entry can already be done using PO mode: in Emacs, type M-x po-mode RET and then RET again to start editing the entry. You should fill in the following fields.
Project-Id-Version: This is the name and version of the package. Fill it in if it has not already been filled in by xgettext.
Report-Msgid-Bugs-To: This has already been filled in by xgettext. It contains an email address or URL where you can report bugs in the untranslated strings.
POT-Creation-Date: This has already been filled in by xgettext.
PO-Revision-Date: You don't need to fill this in. It will be filled by the PO file editor when you save the file.
Last-Translator: Fill in your name and email address (without double quotes).
Language-Team: Fill in the English name of the language, and the email address or homepage URL of the language team you are part of.
Before starting a translation, it is a good idea to get in touch with your translation team, not only to make sure you don't do duplicated work, but also to coordinate difficult linguistic issues.
In the Free Translation Project, each translation team has its own mailing list. The up-to-date list of teams can be found at the Free Translation Project's homepage, http://translationproject.org/, in the "Teams" area.
Content-Type: Replace ‘CHARSET’ with the character encoding used for your language, in your locale, or UTF-8. This field is needed for correct operation of the msgmerge and msgfmt programs, as well as for users whose locale's character encoding differs from yours (see Charset conversion).
You get the character encoding of your locale by running the shell command ‘locale charmap’. If the result is ‘C’ or ‘ANSI_X3.4-1968’, which is equivalent to ‘ASCII’ (= ‘US-ASCII’), it means that your locale is not yet set up correctly. In this case, ask your translation team which charset to use. ‘ASCII’ is not usable for any language except Latin.
Because the PO files must be portable to operating systems with less advanced internationalization facilities, the character encodings that can be used are limited to those supported by both GNU libc and GNU libiconv. These are: ASCII, ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-13, ISO-8859-14, ISO-8859-15, KOI8-R, KOI8-U, KOI8-T, CP850, CP866, CP874, CP932, CP949, CP950, CP1250, CP1251, CP1252, CP1253, CP1254, CP1255, CP1256, CP1257, GB2312, EUC-JP, EUC-KR, EUC-TW, BIG5, BIG5-HKSCS, GBK, GB18030, SHIFT_JIS, JOHAB, TIS-620, VISCII, GEORGIAN-PS, UTF-8.
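The list of portable encodings lends itself to a simple lookup table. The following is only a sketch of how a tool might check a ‘CHARSET’ value against that list; the function name po_charset_is_portable is hypothetical, and this is not how msgfmt or msgmerge implement the check. The comparison ignores case, since encoding names may be written in upper or lower case.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* The encodings listed in the manual as supported by both GNU libc
   and GNU libiconv.  */
static const char *const portable_charsets[] = {
  "ASCII", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4",
  "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9",
  "ISO-8859-13", "ISO-8859-14", "ISO-8859-15", "KOI8-R", "KOI8-U",
  "KOI8-T", "CP850", "CP866", "CP874", "CP932", "CP949", "CP950",
  "CP1250", "CP1251", "CP1252", "CP1253", "CP1254", "CP1255", "CP1256",
  "CP1257", "GB2312", "EUC-JP", "EUC-KR", "EUC-TW", "BIG5",
  "BIG5-HKSCS", "GBK", "GB18030", "SHIFT_JIS", "JOHAB", "TIS-620",
  "VISCII", "GEORGIAN-PS", "UTF-8"
};

/* Compare two names, ignoring ASCII case.  Returns 1 if equal.  */
static int
equal_ignoring_case (const char *a, const char *b)
{
  while (*a != '\0' && *b != '\0')
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }
  return *a == *b;
}

/* Hypothetical helper: returns 1 if NAME is one of the portable
   encodings, 0 otherwise.  */
int
po_charset_is_portable (const char *name)
{
  size_t i;

  assert (name != NULL);
  for (i = 0; i < sizeof portable_charsets / sizeof portable_charsets[0]; i++)
    if (equal_ignoring_case (name, portable_charsets[i]))
      return 1;
  return 0;
}
```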
In the GNU system, the following encodings are frequently used for the corresponding languages.
ISO-8859-1 for Afrikaans, Albanian, Basque, Breton, Catalan, Cornish, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, German, Greenlandic, Icelandic, Indonesian, Irish, Italian, Malay, Manx, Norwegian, Occitan, Portuguese, Spanish, Swedish, Tagalog, Uzbek, Walloon,
ISO-8859-2 for Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian, Slovak, Slovenian,
ISO-8859-3 for Maltese,
ISO-8859-5 for Macedonian, Serbian,
ISO-8859-6 for Arabic,
ISO-8859-7 for Greek,
ISO-8859-8 for Hebrew,
ISO-8859-9 for Turkish,
ISO-8859-13 for Latvian, Lithuanian, Maori,
ISO-8859-14 for Welsh,
ISO-8859-15 for Basque, Catalan, Dutch, English, Finnish, French, Galician, German, Irish, Italian, Portuguese, Spanish, Swedish, Walloon,
KOI8-R for Russian,
KOI8-U for Ukrainian,
KOI8-T for Tajik,
CP1251 for Bulgarian, Belarusian,
GB2312, GBK, GB18030 for simplified writing of Chinese,
BIG5, BIG5-HKSCS for traditional writing of Chinese,
EUC-JP for Japanese,
EUC-KR for Korean,
TIS-620 for Thai,
GEORGIAN-PS for Georgian,
UTF-8 for any language, including those listed above.
When single quote characters or double quote characters are used in translations for your language, and your locale's encoding is one of the ISO-8859-* charsets, it is best to create your PO files in UTF-8 encoding instead of your locale's encoding. This is because in UTF-8 the real quote characters can be represented (single quote characters: U+2018, U+2019; double quote characters: U+201C, U+201D), whereas none of the ISO-8859-* charsets has them all. Users in UTF-8 locales will see the real quote characters, whereas users in ISO-8859-* locales will see the vertical apostrophe and the vertical double quote instead (because that is what the character set conversion will transliterate them to).
To enter such quote characters under X11, you can change your keyboard mapping using the xmodmap program. The X11 names of the quote characters are "leftsinglequotemark", "rightsinglequotemark", "leftdoublequotemark", "rightdoublequotemark", "singlelowquotemark", "doublelowquotemark".
Note that only recent versions of GNU Emacs support the UTF-8 encoding: Emacs 20 with Mule-UCS, and Emacs 21. As of January 2001, XEmacs did not yet support the UTF-8 encoding.
The character encoding name can be written in either upper or lower case. Usually upper case is preferred.
Content-Transfer-Encoding: Set this to 8bit.
Plural-Forms: This field is optional. It is only needed if the PO file has plural forms. You can find them by searching for the ‘msgid_plural’ keyword. The format of the plural forms field is described in Plural forms.
7.1 Invoking the msgmerge Program
msgmerge
    msgmerge [option] def.po ref.pot
The msgmerge
program merges two Uniforum style .po files together.
The def.po file is an existing PO file with translations which will be
taken over to the newly created file as long as they still match; comments
will be preserved, but extracted comments and file positions will be
discarded. The ref.pot file is the last created PO file with
up-to-date source references but old translations, or a PO Template file
(generally created by xgettext
); any translations or comments in the
file will be discarded, however dot comments and file positions will be
preserved. Where an exact match cannot be found, fuzzy matching is used to
produce better results.
‘def.po’: Translations referring to old sources.
‘ref.pot’: References to the new sources.
‘-D directory’, ‘--directory=directory’: Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
‘-C file’, ‘--compendium=file’: Specify an additional library of message translations. See Using Translation Compendia. This option may be specified more than once.
‘-U’, ‘--update’: Update def.po. Do nothing if def.po is already up to date.
‘-o file’, ‘--output-file=file’: Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
The result is written back to def.po.
‘--backup=control’: Make a backup of def.po.
‘--suffix=suffix’: Override the usual backup suffix.
The version control method may be selected via the --backup option or through the VERSION_CONTROL environment variable. Here are the values:
‘none’, ‘off’: Never make backups (even if --backup is given).
‘numbered’, ‘t’: Make numbered backups.
‘existing’, ‘nil’: Make numbered backups if numbered backups for this file already exist, otherwise make simple backups.
‘simple’, ‘never’: Always make simple backups.
The backup suffix is ‘~’, unless set with --suffix
or the
SIMPLE_BACKUP_SUFFIX
environment variable.
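The mapping from control values to backup methods can be summarized in a few lines of C. This is a sketch only: it assumes the conventional value names (‘none’/‘off’, ‘numbered’/‘t’, ‘existing’/‘nil’, ‘simple’/‘never’), the names parse_backup_control and enum backup_method are hypothetical, and the actual msgmerge code may differ, for instance in how it treats abbreviations or unknown values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enumeration of the four backup methods.  */
enum backup_method
{
  BACKUP_NONE,      /* never make backups */
  BACKUP_NUMBERED,  /* make numbered backups */
  BACKUP_EXISTING,  /* numbered if numbered backups exist, else simple */
  BACKUP_SIMPLE,    /* always make simple backups */
  BACKUP_INVALID    /* value not recognized by this sketch */
};

/* Hypothetical parser for a value given through --backup or the
   VERSION_CONTROL environment variable.  */
enum backup_method
parse_backup_control (const char *control)
{
  assert (control != NULL);
  if (strcmp (control, "none") == 0 || strcmp (control, "off") == 0)
    return BACKUP_NONE;
  if (strcmp (control, "numbered") == 0 || strcmp (control, "t") == 0)
    return BACKUP_NUMBERED;
  if (strcmp (control, "existing") == 0 || strcmp (control, "nil") == 0)
    return BACKUP_EXISTING;
  if (strcmp (control, "simple") == 0 || strcmp (control, "never") == 0)
    return BACKUP_SIMPLE;
  return BACKUP_INVALID;
}
```

Note the slightly surprising pair ‘simple’/‘never’: ‘never’ here means "never make numbered backups", which results in simple backups, not in no backups at all.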
‘-m’, ‘--multi-domain’: Apply ref.pot to each of the domains in def.po.
‘-N’, ‘--no-fuzzy-matching’: Do not use fuzzy matching when an exact match is not found. This may speed up the operation considerably.
‘--previous’: Keep the previous msgids of translated messages, marked with ‘#|’, when adding the fuzzy marker to such messages.
‘-P’, ‘--properties-input’: Assume the input files are Java ResourceBundles in Java .properties syntax, not in PO file syntax.
‘--stringtable-input’: Assume the input files are NeXTstep/GNUstep localized resource files in .strings syntax, not in PO file syntax.
‘--force-po’: Always write an output file even if it contains no message.
‘-i’, ‘--indent’: Write the .po file using indented style.
‘--no-location’: Do not write ‘#: filename:line’ lines.
‘--add-location’: Generate ‘#: filename:line’ lines (default).
‘--strict’: Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
‘-p’, ‘--properties-output’: Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
‘--stringtable-output’: Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
‘-w number’, ‘--width=number’: Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
‘--no-wrap’: Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
‘-s’, ‘--sort-output’: Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
‘-F’, ‘--sort-by-file’: Sort output by file location.
‘-h’, ‘--help’: Display this help and exit.
‘-V’, ‘--version’: Output version information and exit.
‘-v’, ‘--verbose’: Increase verbosity level.
‘-q’, ‘--quiet’, ‘--silent’: Suppress progress indicators.
8.1 KDE's PO File Editor
8.2 GNOME's PO File Editor
8.3 Emacs's PO File Editor
8.4 Using Translation Compendia
For those of you who are lucky users of Emacs, PO mode has been created specifically to provide a comfortable environment for editing or modifying PO files. While you edit a PO file, PO mode allows easy browsing of compendium files and auxiliary PO files, and helps you follow references into the set of C program sources from which the PO files have been derived. It has a few special features, among which are the interactive marking of program strings as translatable, and the validation of PO files with easy repositioning to the PO file line showing the error.
To begin with, besides the main PO mode commands (see Main PO mode Commands), you should know how to move between entries (see Entry Positioning) and how to handle untranslated entries (see Untranslated Entries).
Completing GNU gettext Installation
Once you have received, unpacked, configured and compiled the GNU gettext distribution, the ‘make install’ command puts in place the programs xgettext, msgfmt, gettext and msgmerge, as well as their available message catalogs. To top off a comfortable installation, you might also want to make PO mode available to your Emacs users.
While installing PO mode, you might want to modify your ‘.emacs’ file, once and for all, so that it contains a few lines looking like:
    (setq auto-mode-alist
          (cons '("\\.po\\'\\|\\.po\\." . po-mode) auto-mode-alist))
    (autoload 'po-mode "po-mode" "Major mode for translators to edit PO files" t)
Later, whenever you edit a ‘.po’ file, or any file having the string ‘.po.’ within its name, Emacs loads ‘po-mode.elc’ (or ‘po-mode.el’) as needed, and automatically activates PO mode commands for the associated buffer. The string PO appears in the mode line for any buffer for which PO mode is active. Many PO files may be active at once in a single Emacs session.
If you are using Emacs version 20 or newer, and have already installed the appropriate international fonts on your system, you may also tell Emacs how to determine automatically the coding system of every PO file. This will often (but not always) cause the necessary fonts to be loaded and used for displaying the translations in your Emacs window. For this to happen, add the lines:
    (modify-coding-system-alist 'file "\\.po\\'\\|\\.po\\."
                                'po-find-file-coding-system)
    (autoload 'po-find-file-coding-system "po-mode")
to your ‘.emacs’ file. If, despite this, you still see empty boxes instead of your international characters, try a different font set (via Shift and mouse button 1).
After setting up Emacs with something similar to the lines in Completing GNU gettext Installation, PO mode is activated for a window when Emacs finds a PO file in that window. This puts the window read-only and establishes a po-mode-map, which is a genuine Emacs mode, in a way that is not derived from text mode in any way. Functions found on po-mode-hook, if any, will be executed.
When PO mode is active in a window, the letters ‘PO’ appear in the mode line for that window. The mode line also displays how many entries of each kind are held in the PO file. For example, the string ‘132t+3f+10u+2o’ would tell the translator that the PO file contains 132 translated entries (see Translated Entries), 3 fuzzy entries (see Fuzzy Entries), 10 untranslated entries (see Untranslated Entries) and 2 obsolete entries (see Obsolete Entries). Zero-coefficient items are not shown. So, in this example, if the fuzzy entries were unfuzzied, the untranslated entries were translated and the obsolete entries were deleted, the mode line would merely display ‘145t’ for the counters.
The main PO commands are those which do not fit into the other categories of subsequent sections. These allow for quitting PO mode or for managing windows in special ways.
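_: Undo last modification to the PO file (po-undo).
Q: Quit processing and save the PO file (po-quit).
q: Quit processing, possibly after confirmation (po-confirm-and-quit).
0: Temporarily leave the PO file window (po-other-window).
?, h: Show help about PO mode (po-help).
=: Give some PO file statistics (po-statistics).
V: Batch validate the format of the whole PO file (po-validate).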
The command _ (po-undo) interfaces to the Emacs undo facility. See the section ‘Undoing Changes’ in The Emacs Editor. Each time _ is typed, modifications which the translator did to the PO file are undone a little more. For the purpose of undoing, each PO mode command is atomic. This is especially true for the <RET> command: the whole edit made by a single use of this command is undone at once, even if the edit itself implied several actions. However, while in the editing window, one can undo the editing work quite parsimoniously.
The commands Q (po-quit) and q (po-confirm-and-quit) are used when the translator is done with the PO file. The former is a bit less verbose than the latter. If the file has been modified, it is saved to disk first. In both cases, and prior to all this, the commands check if any untranslated messages remain in the PO file and, if so, the translator is asked if she really wants to leave off working with this PO file. This is the preferred way of getting rid of an Emacs PO file buffer. Merely killing it through the usual command C-x k (kill-buffer) is not the tidiest way to proceed.
The command 0 (po-other-window) is another, softer way to leave PO mode temporarily. It just moves the cursor to some other Emacs window, and pops one up if necessary. For example, if the translator has just had PO mode show some source context in another window, she might discover some apparent bug in the program source that needs correction. This command allows the translator to change sex, become a programmer, and have the cursor right in the window containing the program she (or rather he) wants to modify. By later getting the cursor back in the PO file window, or by asking Emacs to edit this file once again, PO mode is then recovered.
The command h (po-help) displays a summary of all available PO mode commands. The translator should then type any character to resume normal PO mode operations. The command ? has the same effect as h.
The command = (po-statistics) computes the total number of entries in the PO file, the ordinal of the current entry (counted from 1), the number of untranslated entries, the number of obsolete entries, and displays all these numbers.
The command V (po-validate) launches msgfmt in checking and verbose mode over the current PO file. This command first offers to save the current PO file on disk. The msgfmt tool, from GNU gettext, has the purpose of creating a MO file out of a PO file, and PO mode uses the features of this program for checking the overall format of a PO file, as well as all individual entries.
The program msgfmt runs asynchronously with Emacs, so the translator regains control immediately while her PO file is being studied. Error output is collected in the Emacs ‘*compilation*’ buffer, displayed in another window. The regular Emacs command C-x ` (next-error), as well as other usual compile commands, allow the translator to reposition quickly to the offending parts of the PO file. Once the cursor is on the line in error, the translator may decide on any PO mode action which would help correct the error.
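For those who want a comparable check outside Emacs, the following Python sketch builds a similar msgfmt invocation. It is an assumption about a reasonable command line, not a transcript of what PO mode runs: it uses the msgfmt options --check, --verbose and --output-file, discards the MO output by writing to /dev/null (so it assumes a Unix-like system), and the helper names are hypothetical. Unlike PO mode, this sketch waits for msgfmt to finish.

```python
import subprocess

def validation_command(po_path, msgfmt="msgfmt"):
    """Build a msgfmt invocation that checks a PO file and discards the MO output."""
    return [msgfmt, "--check", "--verbose", "--output-file=/dev/null", po_path]

def validate(po_path):
    """Run the check synchronously; return (exit status, diagnostics from stderr)."""
    result = subprocess.run(validation_command(po_path),
                            capture_output=True, text=True)
    return result.returncode, result.stderr
```

Calling validate on a PO file path returns a non-zero status together with the msgfmt diagnostics when the file has problems.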
The cursor in a PO file window is almost always part of an entry. The only exceptions are the special case when the cursor is after the last entry in the file, or when the PO file is empty. The entry where the cursor is found is said to be the current entry. Many PO mode commands operate on the current entry, so moving the cursor does more than allow the translator to browse the PO file: it also selects the entry on which commands operate.
Some PO mode commands alter the position of the cursor in a specialized way. A few of those special purpose positioning commands are described here; the others are described in following sections (for a complete list try C-h m):
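.       Redisplay the current entry (po-current-entry).
n       Select the entry after the current one (po-next-entry).
p       Select the entry before the current one (po-previous-entry).
<       Select the first entry in the PO file (po-first-entry).
>       Select the last entry in the PO file (po-last-entry).
m       Record the location of the current entry for later use (po-push-location).
r       Return to a previously saved entry location (po-pop-location).
x       Exchange the current entry location with the previously saved one (po-exchange-location).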
Any Emacs command able to reposition the cursor may be used to select the current entry in PO mode, including commands which move by characters, lines, paragraphs, screens or pages, and search commands. However, there is a kind of standard way to display the current entry in PO mode, which usual Emacs commands moving the cursor do not especially try to enforce. The command . (po-current-entry) has the sole purpose of redisplaying the current entry properly, after the current entry has been changed by means external to PO mode, or the Emacs screen otherwise altered.
It is yet to be decided if PO mode helps the translator, or otherwise irritates her, by forcing a rigid window disposition while she is doing her work. We originally had quite precise ideas about how windows should behave, but on the other hand, anyone used to Emacs is often happy to keep full control. Maybe a fixed window disposition might be offered as a PO mode option that the translator might activate or deactivate at will, so it could be offered on an experimental basis. If nobody feels a real need for using it, or a compulsion for writing it, we should drop this whole idea. The incentive for doing it should come from translators rather than programmers, as opinions from an experienced translator are surely worth more to me than opinions from programmers thinking about how others should do translation.
The commands n (po-next-entry) and p (po-previous-entry) move the cursor to the entry following, or preceding, the current one. If n is given while the cursor is on the last entry of the PO file, or if p is given while the cursor is on the first entry, no move is done.
The commands < (po-first-entry) and > (po-last-entry) move the cursor to the first entry, or last entry, of the PO file. When the cursor is located past the last entry in a PO file, most PO mode commands will return an error saying ‘After last entry’. Moreover, the commands < and > have the special property of being able to work even when the cursor is not inside any PO file entry, and one may use them for nicely correcting this situation. But even these commands will fail on a truly empty PO file. There are development plans for PO mode to interactively fill an empty PO file from sources. See Marking.
The translator may decide, before working at the translation of a particular entry, that she needs to browse the remainder of the PO file, maybe for finding the terminology or phraseology used in related entries. She can of course use the standard Emacs idioms for saving the current cursor location in some register, and use that register for getting back, or else, use the location ring.
PO mode offers another approach, by which cursor locations may be saved onto a special stack. The command m (po-push-location) merely adds the location of the current entry to the stack, pushing the already saved locations under the new one. The command r (po-pop-location) consumes the top stack element and repositions the cursor to the entry associated with that top element. This position is then lost, for the next r will move the cursor to the previously saved location, and so on until no locations remain on the stack.
If the translator wants the position to be kept on the location stack, maybe for taking a look at the entry associated with the top element, then go elsewhere with the intent of getting back later, she ought to use m immediately after r.
The command x (po-exchange-location) simultaneously repositions the cursor to the entry associated with the top element of the stack of saved locations, and replaces that top element with the location of the current entry before the move. Consequently, repeating the x command alternates between two entries. To achieve this, the translator will position the cursor on the first entry, use m, then position to the second entry, and merely use x for making the switch.
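The behaviour of m, r and x can be summarized with a small stack model. This is a Python sketch for illustration only: the class LocationStack is hypothetical, entries are identified by plain integers, and what happens on an empty stack is left out because the text above does not describe it.

```python
class LocationStack:
    """Toy model of the PO mode stack of saved entry locations."""

    def __init__(self):
        self._saved = []

    def push(self, current):
        """m: save the current location on top of the stack."""
        self._saved.append(current)

    def pop(self):
        """r: consume the top element and return the location to move to."""
        return self._saved.pop()

    def exchange(self, current):
        """x: move to the top location and store the previous one in its place."""
        target = self._saved[-1]
        self._saved[-1] = current
        return target


stack = LocationStack()
stack.push(3)                       # m while on entry 3
current = 7                         # the translator moves to entry 7
current = stack.exchange(current)   # x: now on entry 3, entry 7 is saved
current = stack.exchange(current)   # x again: back on entry 7
```

Using push right after pop, as suggested above for r followed by m, leaves the stack unchanged while still moving the cursor.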
There are many different ways of encoding a particular string into a PO file entry, because there are so many different ways to split and quote multi-line strings, and even to represent special characters by backslash escape sequences. Some features of PO mode rely on the ability of PO mode to scan an already existing PO file for a particular string encoded into the msgid field of some entry. Even if PO mode has internally all the built-in machinery for implementing this recognition easily, doing it fast is technically difficult. To facilitate a solution to this efficiency problem, we decided on a canonical representation for strings.
A conventional representation of strings in a PO file is currently under discussion, and PO mode experiments with a canonical representation. Having both xgettext and PO mode converging towards a uniform way of representing equivalent strings would be useful, as the internal normalization needed by PO mode could be automatically satisfied when using xgettext from GNU gettext. An explicit PO mode normalization should then be only necessary for PO files imported from elsewhere, or for when the convention itself evolves.
So, for achieving normalization of at least the strings of a given PO file needing a canonical representation, the following PO mode command is available:
M-x po-normalize    Tidy the whole PO file by making entries more uniform.
The special command M-x po-normalize, which has no associated keys, revises all entries, ensuring that strings of both original and translated entries use uniform internal quoting in the PO file. It also removes any crumb after the last entry. This command may be useful for PO files freshly imported from elsewhere, or if we ever improve on the canonical quoting format we use. This canonical format is not only meant for getting cleaner PO files, but also for greatly speeding up msgid string lookup for some other PO mode commands.
M-x po-normalize presently makes three passes over the entries. The first implements heuristics for converting PO files for GNU gettext 0.6 and earlier, in which msgid and msgstr fields were using K&R style C string syntax for multi-line strings. These heuristics may fail for comments not related to obsolete entries and ending with a backslash; they also depend on subsequent passes for finalizing the proper commenting of continued lines for obsolete entries. This first pass might disappear once all old PO files have been adjusted. The second and third passes normalize all msgid and msgstr strings respectively. They also clean out those trailing backslashes used by XView's msgfmt for continued lines.
Having such an explicit normalizing command allows for importing PO files from other sources, but also eases the evolution of the current convention, an evolution driven mostly by aesthetic concerns, as of now. It is easy to make suggested adjustments at a later time, as the normalizing command and, eventually, other GNU gettext tools should greatly automate conformance. A description of the canonical string format is given below, for the particular benefit of those not having Emacs handy, and who would nevertheless want to handcraft their PO files in nice ways.
Right now, in PO mode, strings are single line or multi-line. A string goes multi-line if and only if it has embedded newlines, that is, if it matches ‘[^\n]\n+[^\n]’. So, we would have:
msgstr "\n\nHello, world!\n\n\n"
but, replacing the space by a newline, this becomes:
msgstr ""
"\n"
"\n"
"Hello,\n"
"world!\n"
"\n"
"\n"
We are deliberately using a caricatural example, here, to make the point clearer. Usually, multi-lines are not that bad looking. It is probable that we will implement the following suggestion. We might lump together all initial newlines into the empty string, and also all newlines introducing empty lines (that is, for n > 1, the n-1'th last newlines would go together on a separate string), so making the previous example appear:
msgstr "\n\n"
"Hello,\n"
"world!\n"
"\n\n"
There are a few yet undecided little points about string normalization, to be documented in this manual, once these questions settle.
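The current rule (a string goes multi-line if and only if it matches ‘[^\n]\n+[^\n]’, and is then split after every newline) can be sketched in Python as follows. This is an illustrative model, not the code of po-normalize: the names escape and format_field are hypothetical, only a handful of escapes are handled, and the suggested lumping of newlines described above is deliberately not implemented.

```python
import re

def escape(text):
    """Quote backslashes, double quotes, newlines and tabs for a PO string."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\t", "\\t"))

def format_field(keyword, text):
    """Write one PO field in single-line or multi-line form."""
    if not re.search(r"[^\n]\n+[^\n]", text):
        # No embedded newline: keep everything on the keyword line.
        return f'{keyword} "{escape(text)}"'
    lines = [f'{keyword} ""']
    # Cut after every newline; a trailing piece without newline is kept too.
    for piece in re.findall(r"[^\n]*\n|[^\n]+", text):
        lines.append(f'"{escape(piece)}"')
    return "\n".join(lines)

print(format_field("msgstr", "\n\nHello, world!\n\n\n"))
print(format_field("msgstr", "\n\nHello,\nworld!\n\n\n"))
```

The two calls correspond to the two msgstr examples shown above: the first stays on one line, the second is split into one quoted piece per line.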
Each PO file entry for which the msgstr field has been filled with a translation, and which is not marked as fuzzy (see Fuzzy Entries), is said to be a translated entry. Only translated entries will later be compiled by GNU msgfmt and become usable in programs. Other entry types will be excluded; translation will not occur for them.
Some commands are more specifically related to translated entry processing.
t       Find the next translated entry (po-next-translated-entry).
T       Find the previous translated entry (po-previous-translated-entry).
The commands t (po-next-translated-entry) and T (po-previous-translated-entry) move forwards or backwards, chasing for a translated entry. If none is found, the search is extended and wraps around in the PO file buffer.
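The wrap-around search can be pictured with the following Python sketch. It is only a model under stated assumptions: entries are reduced to a list of type names, the function find_next is hypothetical, and returning to the starting entry when it is the only match is a choice made here, not something the text specifies.

```python
def find_next(kinds, start, wanted, step=1):
    """Return the index of the next entry of type `wanted`, wrapping around.

    `kinds` lists the type of every entry in file order. `step` is +1 to
    search forwards and -1 to search backwards. None is returned when no
    entry has the wanted type.
    """
    n = len(kinds)
    for offset in range(1, n + 1):
        index = (start + step * offset) % n
        if kinds[index] == wanted:
            return index
    return None


kinds = ["translated", "fuzzy", "untranslated", "translated", "obsolete"]
forward = find_next(kinds, 3, "translated")        # wraps past the end
backward = find_next(kinds, 0, "translated", -1)   # wraps past the beginning
```

The same loop serves for fuzzy, untranslated and obsolete entries by changing the wanted type.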
Translated entries usually result from the translator having edited in a translation for them (see Modifying Translations). However, if the variable po-auto-fuzzy-on-edit is not nil, the entry having received a new translation first becomes a fuzzy entry, which ought to be later unfuzzied before becoming an official, genuine translated entry. See Fuzzy Entries.
Each PO file entry may have a set of attributes, which are qualities given a name and explicitly associated with the translation, using a special system comment. One of these attributes has the name fuzzy, and entries having this attribute are said to have a fuzzy translation. They are called fuzzy entries, for short.
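In PO files the attributes are written on comment lines beginning with ‘#,’, separated by commas. The Python sketch below reads and removes the fuzzy attribute on that basis. It is illustrative only: the helper names are hypothetical, an entry is represented as a plain list of lines, and obsolete entries are not considered.

```python
def flags_of(entry_lines):
    """Collect the attributes listed on the '#,' comment lines of one entry."""
    flags = []
    for line in entry_lines:
        if line.startswith("#,"):
            flags += [flag.strip() for flag in line[2:].split(",") if flag.strip()]
    return flags

def is_fuzzy(entry_lines):
    """Tell whether the entry carries the fuzzy attribute."""
    return "fuzzy" in flags_of(entry_lines)

def unfuzzy(entry_lines):
    """Return the entry without its fuzzy attribute, keeping other attributes."""
    result = []
    for line in entry_lines:
        if line.startswith("#,"):
            kept = [flag for flag in flags_of([line]) if flag != "fuzzy"]
            if kept:
                result.append("#, " + ", ".join(kept))
        else:
            result.append(line)
    return result


entry = ["#, fuzzy, c-format", 'msgid "%d file"', 'msgstr "%d fichier"']
cleaned = unfuzzy(entry)
```

When fuzzy is the only attribute, the whole ‘#,’ line disappears from the result.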
Fuzzy entries, even if they account for translated entries for most other purposes, usually call for revision by the translator. Those may be produced by applying the program msgmerge to update an older translated PO file according to a new PO template file, when this tool hypothesises that some new msgid has been modified only slightly out of an older one, and chooses to pair what it thinks to be the old translation for the new modified entry. The slight alteration in the original string (the msgid string) should often be reflected in the translated string, and this requires the intervention of the translator. For this reason, msgmerge might mark some entries as being fuzzy.
The translator may also mark an entry as fuzzy for her own convenience, when she wants to remember that the entry has to be revisited later. So, some commands are more specifically related to fuzzy entry processing.
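z       Find the next fuzzy entry (po-next-fuzzy-entry).
Z       Find the previous fuzzy entry (po-previous-fuzzy-entry).
<TAB>   Remove the fuzzy attribute of the current entry (po-unfuzzy).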
The commands z (po-next-fuzzy-entry) and Z (po-previous-fuzzy-entry) move forwards or backwards, chasing for a fuzzy entry. If none is found, the search is extended and wraps around in the PO file buffer.
The command <TAB> (po-unfuzzy) removes the fuzzy attribute associated with an entry, usually leaving it translated. Further, if the variable po-auto-select-on-unfuzzy is not nil, the <TAB> command will automatically chase for another interesting entry to work on. The initial value of po-auto-select-on-unfuzzy is nil.
The initial value of po-auto-fuzzy-on-edit is nil. However, if the variable po-auto-fuzzy-on-edit is set to t, any entry edited through the <RET> command is marked fuzzy, as a way to ensure some kind of double check, later. In this case, the usual paradigm is that an entry becomes fuzzy (if not already) whenever the translator modifies it. If she is satisfied with the translation, she then uses <TAB> to pick another entry to work on, clearing the fuzzy attribute at the same time. If she is not satisfied yet, she merely uses <SPC> to chase another entry, leaving the entry fuzzy.
The translator may also use the <DEL> command (po-fade-out-entry) over any translated entry to mark it as being fuzzy, when she wants to leave an easy reminder that she intends to return to this entry later.
Also, when the time comes to quit working on a PO file buffer with the q command, the translator is asked for confirmation if some fuzzy string still exists.
When xgettext originally creates a PO file, unless told otherwise, it initializes the msgid field with the untranslated string, and leaves the msgstr string empty. Such entries, having an empty translation, are said to be untranslated entries. Later, when the programmer slightly modifies some string right in the program, this change is later reflected in the PO file by the appearance of a new untranslated entry for the modified string.
The usual commands moving from entry to entry consider untranslated entries on the same level as active entries. Untranslated entries are easily recognizable by the fact they end with ‘msgstr ""’.
The work of the translator might be (quite naively) seen as the process of seeking an untranslated entry, editing a translation for it, and repeating these actions until no untranslated entries remain. Some commands are more specifically related to untranslated entry processing.
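u       Find the next untranslated entry (po-next-untranslated-entry).
U       Find the previous untranslated entry (po-previous-untranslated-entry).
k       Turn the current entry into an untranslated one (po-kill-msgstr).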
The commands u (po-next-untranslated-entry) and U (po-previous-untranslated-entry) move forwards or backwards, chasing for an untranslated entry. If none is found, the search is extended and wraps around in the PO file buffer.
An entry can be turned back into an untranslated entry by merely emptying its translation, using the command k (po-kill-msgstr). See Modifying Translations.
Also, when time comes to quit working on a PO file buffer with the q command, the translator is asked for confirmation, if some untranslated string still exists.
By obsolete PO file entries, we mean those entries which are commented out, usually by msgmerge when it found that the translation is not needed anymore by the package being localized.
The usual commands moving from entry to entry consider obsolete entries on the same level as active entries. Obsolete entries are easily recognizable by the fact that all their lines start with #, even those lines containing msgid or msgstr.
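The four kinds of entries described so far can be told apart with a few textual tests. The following Python sketch is a simplification under explicit assumptions: an entry is a non-empty list of lines, the fuzzy attribute is looked up on ‘#,’ lines, an untranslated entry is detected by its last line being exactly msgstr "", and plural forms are ignored. The function name classify is hypothetical.

```python
def classify(entry_lines):
    """Return 'obsolete', 'fuzzy', 'untranslated' or 'translated' for one entry."""
    lines = [line.strip() for line in entry_lines if line.strip()]
    if all(line.startswith("#") for line in lines):
        # Every line is commented out, including msgid and msgstr.
        return "obsolete"
    if any(line.startswith("#,") and "fuzzy" in line for line in lines):
        return "fuzzy"
    if lines[-1] == 'msgstr ""':
        return "untranslated"
    return "translated"


examples = {
    "obsolete": ['#~ msgid "old"', '#~ msgstr "ancien"'],
    "fuzzy": ["#, fuzzy", 'msgid "a"', 'msgstr "b"'],
    "untranslated": ['msgid "a"', 'msgstr ""'],
    "translated": ['msgid "a"', 'msgstr "b"'],
}
```

The order of the tests matters: the obsolete check comes first because a commented-out entry may still carry other markers.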
Commands exist for emptying the translation or reinitializing it to the original untranslated string. Commands interfacing with the kill ring may force some previously saved text into the translation. The user may interactively edit the translation. All these commands may apply to obsolete entries, carefully leaving the entry obsolete after the fact.
Some commands are more specifically related to obsolete entry processing.
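o       Find the next obsolete entry (po-next-obsolete-entry).
O       Find the previous obsolete entry (po-previous-obsolete-entry).
<DEL>   Make an active entry obsolete, or delete an obsolete entry (po-fade-out-entry).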
The commands o (po-next-obsolete-entry) and O (po-previous-obsolete-entry) move forwards or backwards, chasing for an obsolete entry. If none is found, the search is extended and wraps around in the PO file buffer.
PO mode does not provide ways for un-commenting an obsolete entry and making it active, because this would reintroduce an original untranslated string which does not correspond to any marked string in the program sources. This goes with the philosophy of never introducing useless msgid values.
However, it is possible to comment out an active entry, so making it obsolete. GNU gettext utilities will later react to the disappearance of a translation by using the untranslated string. The command <DEL> (po-fade-out-entry) pushes the current entry a little further towards annihilation. If the entry is active (it is a translated entry), then it is first made fuzzy. If it is already fuzzy, then the entry is merely commented out, with confirmation. If the entry is already obsolete, then it is completely deleted from the PO file. It is easy to recycle the translation so deleted into some other PO file entry, usually one which is untranslated. See Modifying Translations.
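The successive effects of <DEL> form a small state machine, sketched here in Python. The function fade_out and the state names are hypothetical, the confirmation step is not modelled, and untranslated entries are left out because the paragraph above does not say how they are handled.

```python
def fade_out(state):
    """Return the state an entry reaches after one <DEL>; None means deleted."""
    transitions = {
        "translated": "fuzzy",     # an active entry is first made fuzzy
        "fuzzy": "obsolete",       # a fuzzy entry is commented out
        "obsolete": None,          # an obsolete entry is removed from the file
    }
    return transitions[state]


state = "translated"
history = [state]
while state is not None:
    state = fade_out(state)
    history.append(state)
```

Three uses of <DEL> are thus needed to remove a translated entry completely.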
Here is a quite interesting problem to solve for later development of PO mode, for those nights you are not sleepy. The idea would be that PO mode might become bright enough, one of these days, to make good guesses at retrieving the most probable candidate, among all obsolete entries, for initializing the translation of a newly appeared string. I think it might be quite a hard problem to do this algorithmically, as we have to develop good and efficient measures of string similarity. Right now, PO mode leaves the decision completely to the translator when the time comes to find the adequate obsolete translation; it merely tries to provide handy tools for helping her to do so.
PO mode prevents direct modification of the PO file by the usual means Emacs gives for altering a buffer's contents. By doing so, it aims to help the translator avoid little clerical errors about the overall file format, or the proper quoting of strings, as those errors would be easily made. Other kinds of errors are still possible, but some may be caught and diagnosed by the batch validation process, which the translator may always trigger with the V command. For all other errors, the translator has to rely on her own judgment, and also on the linguistic reports submitted to her by the users of the translated package who share the same mother tongue.
When the time comes to create a translation, correct an error diagnosed mechanically or reported by a user, the translators have to resort to using the following commands for modifying the translations.
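<RET>   Interactively edit the translation (po-edit-msgstr).
<LFD>   Reinitialize the translation with the original, untranslated string (po-msgid-to-msgstr).
k       Save the translation on the kill ring, and delete it (po-kill-msgstr).
w       Save the translation on the kill ring, without deleting it (po-kill-ring-save-msgstr).
y       Replace the translation, taking the new one from the kill ring (po-yank-msgstr).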
The command <RET> (po-edit-msgstr) opens a new Emacs window meant to edit in a new translation, or to modify an already existing translation. The new window contains a copy of the translation taken from the current PO file entry, all ready for editing, expunged of all quoting marks, fully modifiable and with the complete extent of Emacs modifying commands. When the translator is done with her modifications, she may use C-c C-c to close the subedit window with the automatically requoted results, or C-c C-k to abort her modifications. See Subedit, for more information.
The command <LFD> (po-msgid-to-msgstr) initializes, or reinitializes, the translation with the original string. This command is normally used when the translator wants to redo a fresh translation of the original string, disregarding any previous work.
It is possible to arrange that, whenever an untranslated entry is edited, the <LFD> command is automatically executed. If you set po-auto-edit-with-msgid to t, the translation gets initialised with the original string, in case none exists already. The default value for po-auto-edit-with-msgid is nil.
In fact, whether it is best to start a translation with an empty string, or rather with a copy of the original string, is a matter of taste or habit. Sometimes, the source language and the target language are so different that it is simply best to start writing on an empty page. At other times, the source and target languages are so close that it would be a waste to retype a number of words already written in the original string. A translator may also like having the original string right under her eyes, as she will progressively overwrite the original text with the translation, even if this requires some extra editing work to get rid of the original.
The command k (po-kill-msgstr) merely empties the translation string, so turning the entry into an untranslated one. But while doing so, its previous contents are set aside in a special place, known as the kill ring. The command w (po-kill-ring-save-msgstr) also has the effect of taking a copy of the translation onto the kill ring, but it otherwise leaves the entry alone, and does not remove the translation from the entry. Both commands use exactly the Emacs kill ring, which is shared between buffers, and which is well known already to Emacs lovers.
The translator may use k or w many times in the course of her work, as the kill ring may hold several saved translations. From the kill ring, strings may later be reinserted in various Emacs buffers. In particular, the kill ring may be used for moving translation strings between different entries of a single PO file buffer, or if the translator is handling many such buffers at once, even between PO files.
To facilitate exchanges with buffers which are not in PO mode, the translation string put on the kill ring by the k command is fully unquoted before being saved: external quotes are removed, multi-line strings are concatenated, and backslash escaped sequences are turned into their corresponding characters. In the special case of obsolete entries, the translation is also uncommented prior to saving.
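The unquoting steps just listed can be sketched in Python as follows. This is an illustration only, not the PO mode implementation: the function unquote and the ESCAPES table are hypothetical, only the escapes \n, \t, \" and \\ are decoded, and the comment marker of obsolete entries is assumed to be ‘#~’.

```python
import re

ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}

def unquote(field_lines):
    """Decode the quoted pieces of a msgstr field into plain text."""
    pieces = []
    for line in field_lines:
        # Uncomment obsolete entries first, then drop the msgstr keyword.
        line = re.sub(r"^#~\s*", "", line.strip())
        line = re.sub(r"^msgstr\s*", "", line)
        if len(line) >= 2 and line[0] == '"' and line[-1] == '"':
            pieces.append(line[1:-1])   # remove the external quotes
    text = "".join(pieces)              # concatenate multi-line strings
    # Turn each backslash escape into the character it stands for.
    return re.sub(r"\\(.)", lambda m: ESCAPES.get(m.group(1), m.group(1)), text)


plain = unquote(['msgstr ""', '"Hello,\\n"', '"world!\\n"'])
```

The value of plain is the two-line text with real newline characters, ready to be yanked into any buffer.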
The command y (po-yank-msgstr) completely replaces the translation of the current entry by a string taken from the kill ring. Following Emacs terminology, we then say that the replacement string is yanked into the PO file buffer. See (emacs)Yanking, section `Yanking' in The Emacs Editor. The first time y is used, the translation receives the value of the most recent addition to the kill ring. If y is typed once again, immediately, without intervening keystrokes, the translation just inserted is taken away and replaced by the second most recent addition to the kill ring. By repeating y many times in a row, the translator may travel along the kill ring for saved strings, until she finds the string she really wanted.
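The effect of repeating y can be modelled with a tiny ring. This Python sketch is not the Emacs kill ring itself: the class KillRing is hypothetical, it assumes the ring is not empty when yanking, and wrapping back to the most recent string after the oldest one is an assumption borrowed from the usual Emacs behaviour.

```python
class KillRing:
    """Toy model of travelling along saved strings with repeated y."""

    def __init__(self):
        self._items = []    # most recent addition first
        self._cursor = 0

    def save(self, text):
        """What k or w do with the translation: add it at the front."""
        self._items.insert(0, text)
        self._cursor = 0

    def yank(self, repeated=False):
        """y: return the replacement; repeated=True means y typed again at once."""
        if repeated:
            self._cursor = (self._cursor + 1) % len(self._items)
        else:
            self._cursor = 0
        return self._items[self._cursor]


ring = KillRing()
for text in ("un", "deux", "trois"):
    ring.save(text)
first = ring.yank()                  # most recent addition
second = ring.yank(repeated=True)    # second most recent addition
```

Each repeated call replaces the previous result rather than adding to it, which matches the description of y above.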
When a string is yanked into a PO file entry, it is fully and automatically requoted to comply with the format PO files should have. Further, if the entry is obsolete, PO mode then appropriately pushes the inserted string inside comments. Once again, translators should not burden themselves with quoting considerations beyond, of course, whatever the translated string itself requires for the program using it.
Note that k or w are not the only commands pushing strings on the kill ring, as almost any PO mode command replacing translation strings (or the translator comments) automatically saves the old string on the kill ring. The main exceptions to this general rule are the yanking commands themselves.
To better illustrate the operation of killing and yanking, let's use an actual example, taken from a common situation. When the programmer slightly modifies some string right in the program, his change is later reflected in the PO file by the appearance of a new untranslated entry for the modified string, and the fact that the entry translating the original or unmodified string becomes obsolete. In many cases, the translator might spare herself some work by retrieving the unmodified translation from the obsolete entry, then initializing the untranslated entry msgstr field with this retrieved translation. Once this is done, the obsolete entry is not wanted anymore, and may be safely deleted.
When the translator finds an untranslated entry and suspects that a slight variant of the translation exists, she immediately uses m to mark the current entry location, then starts chasing obsolete entries with o, hoping to find some translation corresponding to the unmodified string. Once found, she uses the <DEL> command for deleting the obsolete entry, knowing that <DEL> also kills the translation, that is, pushes the translation on the kill ring. Then, r returns to the initial untranslated entry, and y then yanks the saved translation right into the msgstr field. The translator is then free to use <RET> for fine tuning the translation contents, and maybe to later use u, then m again, for going on with the next untranslated string.
When some sequence of keys has to be typed over and over again, the translator may find it useful to become better acquainted with the Emacs capability of learning these sequences and playing them back on request. See (emacs)Keyboard Macros, section `Keyboard Macros' in The Emacs Editor.
Any translation work done seriously will raise many linguistic difficulties, for which decisions have to be made, and the choices further documented. These documents may be saved within the PO file in form of translator comments, which the translator is free to create, delete, or modify at will. These comments may be useful to herself when she returns to this PO file after a while.
Comments not having whitespace after the initial ‘#’, for example, those beginning with ‘#.’ or ‘#:’, are not translator comments; they are exclusively created by other gettext tools. So, the commands below will never alter such system added comments, as they are not meant for the translator to modify. See PO Files.
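The distinction can be expressed as a one-character test, sketched here in Python. The function comment_kind is hypothetical, a bare ‘#’ line is assumed to count as an (empty) translator comment, and the commented-out lines of obsolete entries are not treated separately.

```python
def comment_kind(line):
    """Classify one line of a PO entry by looking at what follows '#'."""
    if not line.startswith("#"):
        return "not a comment"
    if len(line) == 1 or line[1].isspace():
        return "translator"   # whitespace (or nothing) after '#'
    return "system"           # '#.', '#:', '#,' and similar


entry = [
    "# Check the gender with the maintainer.",
    "#. Shown in the status bar",
    "#: src/main.c:42",
    'msgid "Ready"',
    'msgstr "Prêt"',
]
translator_comments = [line for line in entry if comment_kind(line) == "translator"]
```

Only the lines selected this way would be touched by the comment commands described in this section.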
The following commands are somewhat similar to those modifying translations, so the general indications given for those apply here. See Modifying Translations.
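#       Interactively edit the translator comments (po-edit-comment).
K       Save the translator comments on the kill ring, and delete them (po-kill-comment).
W       Save the translator comments on the kill ring, without deleting them (po-kill-ring-save-comment).
Y       Replace the translator comments, taking the new ones from the kill ring (po-yank-comment).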
These commands parallel PO mode commands for modifying the translation strings, and behave much the same way as they do, except that they handle the part of PO file comments meant for translator usage, rather than the translation strings. So, if the descriptions given below are slightly succinct, it is because the full details have already been given. See Modifying Translations.
The command # (po-edit-comment) opens a new Emacs window containing a copy of the translator comments on the current PO file entry. If there are no such comments, PO mode understands that the translator wants to add a comment to the entry, and she is presented with an empty screen. Comment marks (#) and the space following them are automatically removed before editing, and reinstated after. For translator comments pertaining to obsolete entries, the uncommenting and recommenting operations are done twice. Once in the editing window, the keys C-c C-c allow the translator to tell she is finished with editing the comment. See Subedit, for further details.
Functions found on po-subedit-mode-hook, if any, are executed after the string has been inserted in the edit buffer.
The command K (po-kill-comment) gets rid of all translator comments, while saving those comments on the kill ring. The command W (po-kill-ring-save-comment) takes a copy of the translator comments on the kill ring, but leaves them undisturbed in the current entry. The command Y (po-yank-comment) completely replaces the translator comments by a string taken at the front of the kill ring. When this command is immediately repeated, the comments just inserted are withdrawn, and replaced by other strings taken along the kill ring.
On the kill ring, all strings have the same nature. There is no distinction between translation strings and translator comments strings. So, for example, let's presume the translator has just finished editing a translation, and wants to create a new translator comment to document why the previous translation was not good, just to remember what was the problem. Foreseeing that she will do that in her documentation, the translator may want to quote the previous translation in her translator comments. To do so, she may initialize the translator comments with the previous translation, still at the head of the kill ring. Because editing already pushed the previous translation on the kill ring, she merely has to type M-w prior to #, and the previous translation will be right there, all ready for being introduced by some explanatory text.
On the other hand, presume there are some translator comments already and that the translator wants to add to those comments, instead of wholly replacing them. Then, she should edit the comment right away with #. Once inside the editing window, she can use the regular Emacs commands C-y (yank) and M-y (yank-pop) to get the previous translation where she likes.
The PO subedit minor mode has a few peculiarities worth being described in fuller detail. It installs a few commands over the usual editing set of Emacs, which are described below.
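C-c C-c: Exit the subedit buffer, replacing the string in the PO file (po-subedit-exit).
C-c C-k: Abort the subedit buffer, leaving the string in the PO file unchanged (po-subedit-abort).
C-c C-a: Consult auxiliary PO files (po-subedit-cycle-auxiliary).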
The window's contents represent a translation for a given message, or a translator comment. The translator may modify this window to her heart's content. Once this is done, the command C-c C-c (po-subedit-exit) may be used to return the edited translation into the PO file, replacing the original translation, even if it moved out of sight or if buffers were switched.
If the translator becomes unsatisfied with her translation or comment, to the extent she prefers keeping what existed prior to the <RET> or # command, she may use the command C-c C-k (po-subedit-abort) to merely discard the edit, while preserving the original translation or comment. Another way would be for her to exit normally with C-c C-c, then type U once to undo the whole effect of the last edit.
The command C-c C-a (po-subedit-cycle-auxiliary) allows for glancing through translations already achieved in other languages, directly while editing the current translation. This may be quite convenient when the translator is fluent in many languages, but of course, only makes sense when such completed auxiliary PO files are already available to her (see Auxiliary).
Functions found on po-subedit-mode-hook, if any, are executed after the string has been inserted in the edit buffer.
While editing her translation, the translator should take care not to insert unwanted <RET> (newline) characters at the end of the translated string if those are not meant to be there, and not to remove such characters when they are required. Since these characters are not visible in the editing buffer, they are easily introduced by mistake. To help her, <RET> automatically puts the character < at the end of the string being edited, but this < is not really part of the string. On exiting the editing window with C-c C-c, PO mode automatically removes such < and all whitespace added after it. If the translator adds characters after the terminating <, it loses its delimiting property and integrally becomes part of the string. If she removes the delimiting <, then the edited string is taken as is, with all trailing newlines, even if invisible. Also, if the translated string ought to end itself with a genuine <, then the delimiting < may not be removed; so the string should appear, in the editing window, as ending with two < in a row.
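The delimiter rule described above can be illustrated with a minimal Python sketch. It is not PO mode's actual Emacs Lisp code; the function name subedit_result is hypothetical, and the sketch only encodes the behaviour stated in the paragraph: a final < followed by nothing but whitespace is dropped, and any other buffer is taken as is.

```python
def subedit_result(buffer_text: str) -> str:
    """Return the string that would be stored for a given edit buffer text.

    Hypothetical helper: it mirrors the rule described in the text,
    not the real PO mode implementation.
    """
    # Whitespace typed after a terminating "<" is discarded along with it.
    trimmed = buffer_text.rstrip()
    if trimmed.endswith("<"):
        # The final "<" is only a delimiter: drop it, keep what precedes it.
        return trimmed[:-1]
    # No terminating "<": the buffer is taken as is, trailing newlines included.
    return buffer_text


if __name__ == "__main__":
    for text in ["Done.\n<", "Done.<  \n", "Done.\n\n", "a < b<later", "x <<"]:
        print(repr(text), "->", repr(subedit_result(text)))
```

The last sample shows the case of a string that must really end with <: the buffer ends with two < in a row and only the second one is removed.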
When a translation (or a comment) is being edited, the translator may move the cursor back into the PO file buffer and freely move to other entries, browsing at will. If, with an edit pending, the translator wanders in the PO file buffer, she may decide to start modifying another entry. Each entry being edited has its own subedit buffer. It is possible to simultaneously edit the translation and the comment of a single entry, or to edit entries in different PO files, all at once. Typing <RET> on a field already being edited merely resumes that particular edit. Yet, the translator had better be comfortable handling many Emacs windows!
Pending subedits may be completed or aborted in any order, regardless of how or when they were started. When many subedits are pending and the translator asks for quitting the PO file (with the q command), subedits are automatically resumed one at a time, so she may decide for each of them.
PO mode is particularly powerful when used with PO files created through GNU gettext utilities, as those utilities insert special comments in the PO files they generate. Some of these special comments relate the PO file entry to exactly where the untranslated string appears in the program sources.
When the translator gets to an untranslated entry, she is fairly often faced with an original string which is not as informative as it normally should be, being succinct, cryptic, or otherwise ambiguous. Before choosing how to translate the string, she needs to understand better what the string really means and how tight the translation has to be. Most of the time, when problems arise, the only way left to make her judgment is looking at the true program sources from where this string originated, searching for surrounding comments the programmer might have put in there, and looking around for helping clues of any kind.
Surely, when looking at program sources, the translator will receive more help if she is a fluent programmer. However, even if she is not versed in programming and feels a little lost in C code, the translator should not be shy about taking a look, once in a while. It is most probable that she will still be able to find some of the hints she needs. She will quickly learn not to feel uncomfortable in program code, paying more attention to the programmer's comments, variable and function names (if he dared to choose them well), and overall organization, than to the program code itself.
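The following commands are meant to help the translator at getting program source context for a PO file entry.
s: Resume the display of a program source context, or cycle through them (po-cycle-source-reference).
M-s: Display of a program source context selected by menu (po-select-source-reference).
S: Add a directory to the search path for source files (po-consider-source-path).
M-S: Delete a directory from the search path for source files (po-ignore-source-path).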
The commands s (po-cycle-source-reference) and M-s (po-select-source-reference) both open another window displaying some source program file, already positioned in such a way that it shows an actual use of the string to be translated. By doing so, the command gives source program context for the string. But if the entry has no source context references, or if all references are unresolved along the search path for program sources, then the command diagnoses this as an error.
Even if s (or M-s) opens a new window, the cursor stays in the PO file window. If the translator really wants to get into the program source window, she ought to do it explicitly, maybe by using command O.
When s is typed for the first time, or for a PO file entry which is different from the last one used for getting source context, then the command reacts by giving the first context available for this entry, if any. If some context has already been recently displayed for the current PO file entry, and the translator wandered off to do other things, typing s again will merely resume, in another window, the context last displayed. In particular, if the translator moved the cursor away from the context in the source file, the command will bring the cursor back to the context. When s is used many times in a row, with no other commands intervening, PO mode cycles to the next available contexts for this particular entry, getting back to the first context once the last has been shown.
The command M-s behaves differently. Instead of cycling through references, it lets the translator choose a particular reference among many, and displays that reference. It is best used with completion: if the translator types <TAB> immediately after M-s, in response to the question, she will be offered a menu of all possible references, as a reminder of which are the acceptable answers. This command is useful only where there are really many contexts available for a single string to translate.
Program source files are usually found relative to where the PO file stands. As a special provision, when this fails, the file is also looked for, but relative to the directory immediately above it. Those two cases take proper care of most PO files. However, it might happen that a PO file has been moved, or is edited in a different place than its normal location. When this happens, the translator should tell PO mode in which directory the genuine PO file normally sits. Many such directories may be specified, and all together, they constitute what is called the search path for program sources. The command S (po-consider-source-path) is used to interactively enter a new directory at the front of the search path, and the command M-S (po-ignore-source-path) is used to select, with completion, one of the directories she does not want anymore on the search path.
PO mode is able to help the knowledgeable translator, fluent in many languages, take advantage of translations already achieved in other languages she just happens to know. It provides these other language translations as additional context for her own work. Moreover, it has features to ease the production of translations for many languages at once, for translators preferring to work in this way.
An auxiliary PO file is an existing PO file meant for the same package the translator is working on, but targeted to a different mother tongue language. Commands exist for declaring and handling auxiliary PO files, and also for showing contexts for the entry under work.
Here are the auxiliary file commands available in PO mode.
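a: Seek auxiliary files for another translation for the same entry (po-cycle-auxiliary).
C-c C-a: Switch to a particular auxiliary file (po-select-auxiliary).
A: Declare this PO file as an auxiliary file (po-consider-as-auxiliary).
M-A: Remove this PO file from the list of auxiliary files (po-ignore-as-auxiliary).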
Command A (po-consider-as-auxiliary) adds the current PO file to the list of auxiliary files, while command M-A (po-ignore-as-auxiliary) just removes it.
The command a (po-cycle-auxiliary) seeks all auxiliary PO files, round-robin, searching for a translated entry in some other language having an msgid field identical to the one for the current entry. The found PO file, if any, takes the place of the current PO file in the display (its window gets on top). Before doing so, the current PO file is also made into an auxiliary file, if not already. So, a in this newly displayed PO file will seek another PO file, and so on, so repeating a will eventually yield back the original PO file.
The command C-c C-a (po-select-auxiliary) asks the translator for her choice of a particular auxiliary file, with completion, and then switches to that selected PO file. The command also checks if the selected file has an msgid field identical to the one for the current entry, and if yes, this entry becomes current. Otherwise, the cursor of the selected file is left undisturbed.
For all this to work fully, auxiliary PO files will have to be normalized, in the sense that msgid fields should be written exactly the same way. It is possible to write msgid fields in various ways for representing the same string, and different writing would break the proper behaviour of the auxiliary file commands of PO mode. This is not expected to be much of a problem in practice, as most existing PO files have their msgid entries written by the same GNU gettext tools.
However, PO files initially created by PO mode itself, while marking strings in source files, are normalised differently. So are PO files resulting from the ‘M-x normalize’ command. Until these discrepancies between PO mode and other GNU gettext tools get fully resolved, the translator should stay aware of normalisation issues.
A compendium is a special PO file containing a set of translations recurring in many different packages. The translator can use gettext tools to build a new compendium, to add entries to her compendium, and to initialize untranslated entries, or to update already translated entries, from translations kept in the compendium.
8.4.1 Creating Compendia | Merging translations for later use
8.4.2 Using Compendia | Using older translations if they fit
Basically every PO file consisting of translated entries only can be declared as a valid compendium. Often the translator wants to have special compendia; let's consider two cases: concatenating PO files and extracting a message subset from a PO file.
To concatenate several valid PO files into one compendium file you can use ‘msgcomm’ or ‘msgcat’ (the latter preferred):
msgcat -o compendium.po file1.po file2.po
By default, msgcat will accumulate divergent translations for the same string. Those occurrences will be marked as fuzzy and decorated in a highly visible way; calling msgcat on ‘file1.po’:
#: src/hello.c:200
#, c-format
msgid "Report bugs to <%s>.\n"
msgstr "Comunicar `bugs' a <%s>.\n"
and ‘file2.po’:
#: src/bye.c:100
#, c-format
msgid "Report bugs to <%s>.\n"
msgstr "Comunicar \"bugs\" a <%s>.\n"
will result in:
#: src/hello.c:200 src/bye.c:100
#, fuzzy, c-format
msgid "Report bugs to <%s>.\n"
msgstr ""
"#-#-#-#-# file1.po #-#-#-#-#\n"
"Comunicar `bugs' a <%s>.\n"
"#-#-#-#-# file2.po #-#-#-#-#\n"
"Comunicar \"bugs\" a <%s>.\n"
The translator will have to resolve this “conflict” manually; she has to decide whether the first or the second version is appropriate (or provide a new translation), to delete the “marker lines”, and finally to remove the fuzzy mark.
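The way divergent translations end up in one fuzzy entry can be sketched in Python. This is only an illustration modelled on the example output shown above, not the msgcat implementation; the function name combine and the (filename, msgstr) pair layout are assumptions made for the sketch.

```python
def combine(translations, use_first=False):
    """Combine the translations of one msgid coming from several PO files.

    translations: list of (filename, msgstr) pairs, in command-line order.
    Returns a (msgstr, fuzzy) pair. Hypothetical helper that imitates the
    marker layout of the example output, nothing more.
    """
    if not translations:
        raise ValueError("at least one translation is required")
    distinct = {msgstr for _, msgstr in translations}
    if use_first or len(distinct) == 1:
        # No conflict (or the first translation is preferred): keep it as is.
        return translations[0][1], False
    parts = []
    for filename, msgstr in translations:
        # One marker line per source file, followed by that file's translation.
        parts.append(f"#-#-#-#-# {filename} #-#-#-#-#\n")
        parts.append(msgstr)
    # The combined entry must be reviewed by the translator, hence fuzzy.
    return "".join(parts), True


if __name__ == "__main__":
    merged, fuzzy = combine([
        ("file1.po", "Comunicar `bugs' a <%s>.\n"),
        ("file2.po", 'Comunicar "bugs" a <%s>.\n'),
    ])
    print("fuzzy:", fuzzy)
    print(merged, end="")
```

Resolving the conflict then amounts to keeping one of the translations, deleting the marker lines, and clearing the fuzzy flag.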
If the translator knows in advance that the first found translation of a message is always the best translation, she can make use of the ‘--use-first’ switch:
msgcat --use-first -o compendium.po file1.po file2.po
A good compendium file must not contain fuzzy or untranslated entries. If input files are “dirty” you must preprocess the input files or postprocess the result using ‘msgattrib --translated --no-fuzzy’.
Nobody wants to translate the same messages again and again; thus you may wish to have a compendium file containing ‘getopt.c’ messages.
To extract a message subset (e.g., all ‘getopt.c’ messages) from an existing PO file into one compendium file you can use ‘msggrep’:
msggrep --location src/getopt.c -o compendium.po file.po
You can use a compendium file to initialize a translation from scratch or to update an already existing translation.
Since a PO file with translations does not exist, the translator can merely use ‘/dev/null’ to fake the “old” translation file.
msgmerge --compendium compendium.po -o file.po /dev/null file.pot
Concatenate the compendium file(s) and the existing PO, merge the result with the POT file and remove the obsolete entries (optional, here done using ‘msgattrib’):
msgcat --use-first -o update.po compendium1.po compendium2.po file.po
msgmerge update.po file.pot | msgattrib --no-obsolete > file.po
Sometimes it is necessary to manipulate PO files in a way that is better performed automatically than by hand. GNU gettext includes a complete set of tools for this purpose.
When merging two packages into a single package, the resulting POT file will be the concatenation of the two packages' POT files. Thus the maintainer must concatenate the two existing package translations into a single translation catalog, for each language. This is best performed using ‘msgcat’. It is then the translators' duty to deal with any possible conflicts that arose during the merge.
When a translator takes over the translation job from another translator, but she uses a different character encoding in her locale, she will convert the catalog to her character encoding. This is best done through the ‘msgconv’ program.
When a maintainer takes a source file with tagged messages from another package, he should also take the existing translations for this source file (and not let the translators do the same job twice). One way to do this is through ‘msggrep’, another is to create a POT file for that source file and use ‘msgmerge’.
When a translator wants to adjust some translation catalog for a special dialect or orthography — for example, German as written in Switzerland versus German as written in Germany — she needs to apply some text processing to every message in the catalog. The tool for doing this is ‘msgfilter’.
Another use of msgfilter is to produce approximately the POT file for which a given PO file was made. This can be done through a filter command like ‘msgfilter sed -e d | sed -e '/^# /d'’. Note that the original POT file may have had different comments and different plural message counts, so it is better to use the original POT file if available.
When a translator wants to check her translations, for example according to orthography rules or using a non-interactive spell checker, she can do so using the ‘msgexec’ program.
When third party tools create PO or POT files, sometimes duplicates cannot be avoided. But the GNU gettext tools give an error when they encounter duplicate msgids in the same file and in the same domain. To merge duplicates, the ‘msguniq’ program can be used.
‘msgcomm’ is a more general tool for keeping or throwing away duplicates, occurring in different files.
‘msgcmp’ can be used to check whether a translation catalog is completely translated.
‘msgattrib’ can be used to select and extract only the fuzzy or untranslated messages of a translation catalog.
‘msgen’ is useful as a first step for preparing English translation catalogs. It copies each message's msgid to its msgstr.
Finally, for those applications where all these various programs are not sufficient, a library ‘libgettextpo’ is provided that can be used to write other specialized programs that process PO files.
msgcat
msgcat [option] [inputfile]...
The msgcat program concatenates and merges the specified PO files. It finds messages which are common to two or more of the specified PO files. By using the --more-than option, greater commonality may be requested before messages are printed. Conversely, the --less-than option may be used to specify less commonality before messages are printed (i.e. ‘--less-than=2’ will only print the unique messages). Translations, comments and extracted comments will be accumulated, except that if --use-first is specified, they will be taken from the first PO file to define them. File positions from all PO files will be accumulated.
Input files.
Read the names of the input files from file instead of getting them from the command line.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If inputfile is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
‘-< number’, ‘--less-than=number’: Print messages with less than number definitions, defaults to infinite if not set.
‘-> number’, ‘--more-than=number’: Print messages with more than number definitions, defaults to 0 if not set.
‘-u’, ‘--unique’: Shorthand for ‘--less-than=2’. Requests that only unique messages be printed.
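The effect of the two bounds can be sketched in Python. The sketch is an assumption-laden model, not msgcat's code: each input file is reduced to a set of msgids, the function name select_by_commonality is hypothetical, and the result is sorted for readability, whereas the real tool has its own output order.

```python
from collections import Counter


def select_by_commonality(catalogs, more_than=0, less_than=None):
    """Keep msgids defined in more than `more_than` and fewer than
    `less_than` of the given catalogs (None stands for "infinite").

    catalogs: one collection of msgids per input file (hypothetical model).
    """
    counts = Counter()
    for msgids in catalogs:
        # A msgid counts once per file, however often it appears in it.
        counts.update(set(msgids))
    return sorted(
        msgid
        for msgid, n in counts.items()
        if n > more_than and (less_than is None or n < less_than)
    )


if __name__ == "__main__":
    files = [{"Open", "Close"}, {"Open", "Quit"}, {"Open", "Close", "Help"}]
    print(select_by_commonality(files))               # every message
    print(select_by_commonality(files, less_than=2))  # unique messages only
    print(select_by_commonality(files, more_than=1))  # messages in 2+ files
```

With the default bounds every message passes, which matches the plain concatenating behaviour; a less-than bound of 2 keeps only messages defined in a single file.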
Assume the input files are Java ResourceBundles in Java .properties syntax, not in PO file syntax.
Assume the input files are NeXTstep/GNUstep localized resource files in .strings syntax, not in PO file syntax.
Specify encoding for output.
Use first available translation for each message. Don't merge several translations into one.
Specify whether or when to use colors and other text attributes. See The --color option for details.
Specify the CSS style rule file to use for --color. See The --style option for details.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
msgconv
msgconv [option] [inputfile]
The msgconv program converts a translation catalog to a different character encoding.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Specify encoding for output.
The default encoding is the current locale's encoding.
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
msggrep
msggrep [option] [inputfile]
The msggrep program extracts all messages of a translation catalog that match a given pattern or belong to some given source files.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
[-N sourcefile]... [-M domainname]... [-J msgctxt-pattern] [-K msgid-pattern] [-T msgstr-pattern] [-C comment-pattern]
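A message is selected if
- it comes from one of the specified source files,
- or if it comes from one of the specified domains,
- or if ‘-J’ is given and its context (msgctxt) matches msgctxt-pattern,
- or if ‘-K’ is given and its key (msgid or msgid_plural) matches msgid-pattern,
- or if ‘-T’ is given and its translation (msgstr) matches msgstr-pattern,
- or if ‘-C’ is given and the translator's comment matches comment-pattern.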
When more than one selection criterion is specified, the set of selected messages is the union of the selected messages of each criterion.
msgctxt-pattern or msgid-pattern or msgstr-pattern syntax:
[-E | -F] [-e pattern | -f file]...
Patterns are basic regular expressions by default, or extended regular expressions if -E is given, or fixed strings if -F is given.
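‘-N sourcefile’, ‘--location=sourcefile’: Select messages extracted from sourcefile. sourcefile can be either a literal file name or a wildcard pattern.
‘-M domainname’, ‘--domain=domainname’: Select messages belonging to domain domainname.
‘-J’, ‘--msgctxt’: Start of patterns for the msgctxt.
‘-K’, ‘--msgid’: Start of patterns for the msgid.
‘-T’, ‘--msgstr’: Start of patterns for the msgstr.
‘-C’, ‘--comment’: Start of patterns for the translator's comment.
‘-X’, ‘--extracted-comment’: Start of patterns for the extracted comments.
‘-E’, ‘--extended-regexp’: Specify that pattern is an extended regular expression.
‘-F’, ‘--fixed-strings’: Specify that pattern is a set of newline-separated strings.
‘-e pattern’, ‘--regexp=pattern’: Use pattern as a regular expression.
‘-f file’, ‘--file=file’: Obtain pattern from file.
‘-i’, ‘--ignore-case’: Ignore case distinctions.
‘-v’, ‘--invert-match’: Output only the messages that do not match any selection criterion, instead of the messages that match a selection criterion.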
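How the selection criteria combine (a union, optionally inverted) can be sketched in Python. This is a simplified model rather than msggrep itself: the message dictionary layout and the function name is_selected are assumptions, only three criteria are modelled, and Python's re syntax stands in for the basic regular expressions that msggrep uses by default.

```python
import fnmatch
import re


def is_selected(message, sourcefiles=(), msgid_patterns=(),
                msgstr_patterns=(), invert=False):
    """Tell whether a message passes the selection.

    message: dict with 'locations' (list of source file names), 'msgid'
    and 'msgstr'. Hypothetical structure used only for this sketch.
    """
    hit = (
        # Source file criterion: literal names or wildcard patterns.
        any(fnmatch.fnmatchcase(loc, pat)
            for loc in message["locations"] for pat in sourcefiles)
        # Pattern criteria on the original string and on the translation.
        or any(re.search(p, message["msgid"]) for p in msgid_patterns)
        or any(re.search(p, message["msgstr"]) for p in msgstr_patterns)
    )
    # Matching any one criterion is enough; inversion flips the result.
    return hit != invert


if __name__ == "__main__":
    msg = {"locations": ["src/main.c"], "msgid": "Please specify a file",
           "msgstr": ""}
    print(is_selected(msg, sourcefiles=["gnulib-lib/*.c"],
                      msgid_patterns=["Please specify"]))
```

The message in the demonstration does not come from the requested source files, yet it is selected because its msgid matches, which is the union behaviour described above.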
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
To extract the messages that come from the source files gnulib-lib/error.c and gnulib-lib/getopt.c:
msggrep -N gnulib-lib/error.c -N gnulib-lib/getopt.c input.po
To extract the messages that contain the string “Please specify” in the original string:
msggrep --msgid -F -e 'Please specify' input.po
To extract the messages that have a context specifier of either “Menu>File” or “Menu>Edit” or a submenu of them:
msggrep --msgctxt -E -e '^Menu>(File|Edit)' input.po
To extract the messages whose translation contains one of the strings in the file wordlist.txt:
msggrep --msgstr -F -f wordlist.txt input.po
msgfilter
msgfilter [option] filter [filter-option]
The msgfilter program applies a filter to all translations of a translation catalog.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
The filter can be any program that reads a translation from standard input and writes a modified translation to standard output. A frequently used filter is ‘sed’. A few particular built-in filters are also recognized.
Note: If the filter is not a built-in filter, you have to take care of encodings: It is your responsibility to ensure that the filter can cope with input encoded in the translation catalog's encoding. If the filter wants input in a particular encoding, you can in a first step convert the translation catalog to that encoding using the ‘msgconv’ program, before invoking ‘msgfilter’. If the filter wants input in the locale's encoding, but you want to avoid the locale's encoding, then you can first convert the translation catalog to UTF-8 using the ‘msgconv’ program and then make ‘msgfilter’ work in a UTF-8 locale, by using the LC_ALL environment variable.
Note: Most translations in a translation catalog don't end with a newline character. For this reason, it is important that the filter recognizes its last input line even if it ends without a newline, and that it doesn't add an undesired trailing newline at the end. The ‘sed’ program on some platforms is known to ignore the last line of input if it is not terminated with a newline. You can use GNU sed instead; it does not have this limitation.
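A filter that respects both notes can be sketched in Python. The example is hypothetical: it rewrites the German ß as ss, as one might do for a Swiss orthography, it assumes the text it receives is already decoded, and the names to_swiss_orthography and run_filter are invented for the sketch. The point it illustrates is that the whole input is read at once and written back without appending a newline.

```python
import io


def to_swiss_orthography(text: str) -> str:
    """Replace the German sharp s, which Swiss orthography does not use."""
    return text.replace("ß", "ss")


def run_filter(source, sink) -> None:
    """Read one translation from `source`, write the result to `sink`.

    The input is read in full, so a last line without a newline is not
    lost, and nothing is appended, so no trailing newline is introduced.
    """
    sink.write(to_swiss_orthography(source.read()))


if __name__ == "__main__":
    # Demonstration on in-memory streams; a real filter program would call
    # run_filter(sys.stdin, sys.stdout) instead.
    out = io.StringIO()
    run_filter(io.StringIO("Straße"), out)
    print(repr(out.getvalue()))
```

When wired to standard input and output, such a script would be passed to msgfilter as the filter command, with the catalog converted beforehand to an encoding the script can decode.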
Add script to the commands to be executed.
Add the contents of scriptfile to the commands to be executed.
Suppress automatic printing of pattern space.
The filter ‘recode-sr-latin’ is recognized as a built-in filter. The command ‘recode-sr-latin’ converts Serbian text, written in the Cyrillic script, to the Latin script. The command ‘msgfilter recode-sr-latin’ applies this conversion to the translations of a PO file. Thus, it can be used to convert an ‘sr.po’ file to an ‘sr@latin.po’ file.
The use of built-in filters is not sensitive to the current locale's encoding. Moreover, when used with a built-in filter, ‘msgfilter’ can automatically convert the message catalog to the UTF-8 encoding when needed.
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Keep the header entry, i.e. the message with ‘msgid ""’, unmodified, instead of filtering it. By default, the header entry is subject to filtering like any other message.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
To convert German translations to Swiss orthography (in a UTF-8 locale):
msgconv -t UTF-8 de.po | msgfilter sed -e 's/ß/ss/g'
To convert Serbian translations in Cyrillic script to Latin script:
msgfilter recode-sr-latin sr.po
msguniq
msguniq [option] [inputfile]
The msguniq program unifies duplicate translations in a translation catalog. It finds duplicate translations of the same message ID. Such duplicates are invalid input for other programs like msgfmt, msgmerge or msgcat. By default, duplicates are merged together. When using the ‘--repeated’ option, only duplicates are output, and all other messages are discarded. Comments and extracted comments will be accumulated, except that if ‘--use-first’ is specified, they will be taken from the first translation. File positions will be accumulated. When using the ‘--unique’ option, duplicates are discarded.
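The three modes described above can be tried on a throwaway catalog. This is a minimal sketch: the file name `dup.po` and its contents are invented for the illustration, and the msguniq calls are skipped if the program is not installed.

```shell
#!/bin/sh
# Build a small sample catalog in which the msgid "Hello" occurs twice.
tmpdir=$(mktemp -d)
cat > "$tmpdir/dup.po" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Hello"
msgstr "Bonjour"

msgid "Bye"
msgstr "Au revoir"

msgid "Hello"
msgstr "Salut"
EOF

if command -v msguniq >/dev/null 2>&1; then
    # Default: duplicates are merged into a single entry.
    msguniq "$tmpdir/dup.po"
    # Only the messages that have duplicates.
    msguniq --repeated "$tmpdir/dup.po"
    # Only the messages that have no duplicates.
    msguniq --unique "$tmpdir/dup.po"
fi
```

Comparing the three outputs side by side is a quick way to see which entries msguniq considers duplicates before running it on a real catalog.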
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Print only duplicates.
Print only unique messages, discard duplicates.
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Specify encoding for output.
Use first available translation for each message. Don't merge several translations into one.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
msgcomm
msgcomm [option] [inputfile]...
The msgcomm program finds messages which are common to two or more of the specified PO files. By using the --more-than option, greater commonality may be requested before messages are printed. Conversely, the --less-than option may be used to specify less commonality before messages are printed (i.e. ‘--less-than=2’ will only print the unique messages). Translations, comments and extracted comments will be preserved, but only from the first PO file to define them. File positions from all PO files will be accumulated.
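To see the effect of the commonality thresholds, two tiny catalogs are enough. The sketch below is an illustration only: the file names `a.po` and `b.po` and their messages are made up, and the msgcomm calls are skipped if the program is not installed.

```shell
#!/bin/sh
# Two small sample catalogs: "Open" appears in both files,
# "Save" and "Print" appear in one file each.
tmpdir=$(mktemp -d)
cat > "$tmpdir/a.po" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr "Ouvrir"

msgid "Save"
msgstr "Enregistrer"
EOF
cat > "$tmpdir/b.po" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr "Ouvrir"

msgid "Print"
msgstr "Imprimer"
EOF

if command -v msgcomm >/dev/null 2>&1; then
    # Default behaviour: messages defined in two or more of the files.
    msgcomm "$tmpdir/a.po" "$tmpdir/b.po"
    # Messages defined in only one of the files.
    msgcomm --less-than=2 "$tmpdir/a.po" "$tmpdir/b.po"
fi
```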
Input files.
Read the names of the input files from file instead of getting them from the command line.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If inputfile is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Print messages with fewer than number definitions; defaults to infinity if not set.
Print messages with more than number definitions; defaults to 1 if not set.
Shorthand for ‘--less-than=2’. Requests that only unique messages be printed.
Assume the input files are Java ResourceBundles in Java .properties syntax, not in PO file syntax.
Assume the input files are NeXTstep/GNUstep localized resource files in .strings syntax, not in PO file syntax.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Don't write header with ‘msgid ""’ entry.
Display this help and exit.
Output version information and exit.
msgcmp
msgcmp [option] def.po ref.pot
The msgcmp program compares two Uniforum style .po files to check that both contain the same set of msgid strings. The def.po file is an existing PO file with the translations. The ref.pot file is the last created PO file, or a PO Template file (generally created by xgettext). This is useful for checking that you have translated each and every message in your program. Where an exact match cannot be found, fuzzy matching is used to produce better diagnostics.
Translations.
References to the sources.
Add directory to the list of directories. Source files are searched relative to this list of directories.
Apply ref.pot to each of the domains in def.po.
Consider fuzzy messages in the def.po file like translated messages. Note that using this option is usually wrong, because fuzzy messages are exactly those which have not been validated by a human translator.
Consider untranslated messages in the def.po file like translated messages. Note that using this option is usually wrong.
Assume the input files are Java ResourceBundles in Java .properties syntax, not in PO file syntax.
Assume the input files are NeXTstep/GNUstep localized resource files in .strings syntax, not in PO file syntax.
Display this help and exit.
Output version information and exit.
msgattrib
msgattrib [option] [inputfile]
The msgattrib program filters the messages of a translation catalog according to their attributes, and manipulates the attributes.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Keep translated messages, remove untranslated messages.
Keep untranslated messages, remove translated messages.
Remove ‘fuzzy’ marked messages.
Keep ‘fuzzy’ marked messages, remove all other messages.
Remove obsolete #~ messages.
Keep obsolete #~ messages, remove all other messages.
Attributes are modified after the message selection/removal has been performed. If the ‘--only-file’ or ‘--ignore-file’ option is specified, the attribute modification is applied only to those messages that are listed in the only-file and not listed in the ignore-file.
Set all messages ‘fuzzy’.
Set all messages non-‘fuzzy’.
Set all messages obsolete.
Set all messages non-obsolete.
Remove the “previous msgid” (‘#|’) comments from all messages.
Limit the attribute changes to entries that are listed in file. file should be a PO or POT file.
Limit the attribute changes to entries that are not listed in file. file should be a PO or POT file.
Synonym for ‘--only-fuzzy --clear-fuzzy’: It keeps only the fuzzy messages and removes their ‘fuzzy’ mark.
Synonym for ‘--only-obsolete --clear-obsolete’: It keeps only the obsolete messages and makes them non-obsolete.
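Selection and attribute manipulation can be combined in a single call, with the selection applied first. The following sketch uses an invented catalog `sample.po` holding one translated, one untranslated and one fuzzy entry; the msgattrib calls are skipped if the program is not installed.

```shell
#!/bin/sh
# Sample catalog with one translated, one untranslated and one fuzzy entry.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.po" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr "Ouvrir"

msgid "Close"
msgstr ""

#, fuzzy
msgid "Save file"
msgstr "Enregistrer"
EOF

if command -v msgattrib >/dev/null 2>&1; then
    # Selection: keep only the entries that still lack a translation.
    msgattrib --untranslated "$tmpdir/sample.po"
    # Selection: keep only the entries marked fuzzy.
    msgattrib --only-fuzzy "$tmpdir/sample.po"
    # Selection followed by manipulation: keep the fuzzy entries and
    # remove their fuzzy mark.
    msgattrib --only-fuzzy --clear-fuzzy "$tmpdir/sample.po"
fi
```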
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
msgen
msgen [option] inputfile
The msgen program creates an English translation catalog. The input file is the last created English PO file, or a PO Template file (generally created by xgettext). Untranslated entries are assigned a translation that is identical to the msgid.
Note: ‘msginit --no-translator --locale=en’ performs a very similar task. The main difference is that msginit takes special care of the header entry, whereas msgen doesn't.
Input PO or POT file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If inputfile is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
msgexec
msgexec [option] command [command-option]
The msgexec program applies a command to all translations of a translation catalog. The command can be any program that reads a translation from standard input. It is invoked once for each translation. Its output becomes msgexec's output. msgexec's return code is the maximum return code across all invocations.
A special builtin command called ‘0’ outputs the translation, followed by a null byte. The output of ‘msgexec 0’ is suitable as input for ‘xargs -0’.
During each command invocation, the environment variable MSGEXEC_MSGID is bound to the message's msgid, and the environment variable MSGEXEC_LOCATION is bound to the location in the PO file of the message. If the message has a context, the environment variable MSGEXEC_MSGCTXT is bound to the message's msgctxt, otherwise it is unbound.
Note: It is your responsibility to ensure that the command can cope with input encoded in the translation catalog's encoding. If the command wants input in a particular encoding, you can first convert the translation catalog to that encoding using the ‘msgconv’ program, before invoking ‘msgexec’. If the command wants input in the locale's encoding, but you want to avoid the locale's encoding, then you can first convert the translation catalog to UTF-8 using the ‘msgconv’ program and then make ‘msgexec’ work in a UTF-8 locale, by setting the LC_ALL environment variable.
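A command run by msgexec can pair each translation, read from standard input, with the msgid taken from the environment. The sketch below simulates a single invocation by hand so that it runs without msgexec; the helper name `show_pair`, the sample strings and the file name `fr.po` in the comment are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical helper with the same contract as a msgexec command:
# the translation arrives on standard input and the msgid is found in
# the MSGEXEC_MSGID environment variable.
show_pair() {
    # Note: command substitution strips trailing newlines, if any.
    translation=$(cat)
    printf '%s => %s\n' "$MSGEXEC_MSGID" "$translation"
}

# Simulate one invocation by hand, without msgexec.
MSGEXEC_MSGID='Hello'
export MSGEXEC_MSGID
printf '%s' 'Bonjour' | show_pair

# With a real catalog on standard input, the same logic could be run as:
#   msgexec sh -c 'printf "%s => %s\n" "$MSGEXEC_MSGID" "$(cat)"' < fr.po
```

The simulated call prints `Hello => Bonjour`.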
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.
Display this help and exit.
Output version information and exit.
Translators are usually only interested in seeing the untranslated and fuzzy messages of a PO file. Also, when a message is set fuzzy because the msgid changed, they want to see the differences between the previous msgid and the current one (especially if the msgid is long and only a few words in it have changed). Finally, it's always welcome to highlight the different sections of a message in a PO file (comments, msgid, msgstr, etc.).
Such highlighting is possible through the msgcat options ‘--color’ and ‘--style’.
The --color option
The ‘--color=when’ option specifies under which conditions colorized output should be generated. The when part can be one of the following:
always
yes
The output will be colorized.
never
no
The output will not be colorized.
auto
tty
The output will be colorized if the output device is a tty, i.e. when the output goes directly to a text screen or terminal emulator window.
html
The output will be colorized and be in HTML format.
‘--color’ is equivalent to ‘--color=yes’. The default is ‘--color=auto’.
Thus, a command like ‘msgcat vi.po’ will produce colorized output when called by itself in a command window, whereas in a pipe, such as ‘msgcat vi.po | less -R’, it will not. To get colorized output in this situation nevertheless, use the command ‘msgcat --color vi.po | less -R’.
The ‘--color=html’ option will produce output that can be viewed in a browser. This can be useful, for example, for Indic languages, because the rendering of Indic scripts in browsers is usually better than in terminal emulators.
Note that the output produced with the --color option is not a valid PO file in itself. It contains additional terminal-specific escape sequences or HTML tags. A PO file reader will give a syntax error when confronted with such content. Except for the ‘--color=html’ case, you therefore normally don't need to save output produced with the --color option in a file.
TERM
The environment variable TERM contains an identifier for the text window's capabilities. You can get a detailed list of these capabilities by using the ‘infocmp’ command, using ‘man 5 terminfo’ as a reference.
When producing text with embedded color directives, msgcat looks at the TERM variable. Text windows today typically support at least 8 colors. Often, however, the text window supports 16 or more colors, even though the TERM variable is set to an identifier denoting only 8 supported colors. It can be worth setting the TERM variable to a different value in these cases:
xterm
xterm is in most cases built with support for 16 colors. It can also be built with support for 88 or 256 colors (but not both). You can try to set TERM to either xterm-16color, xterm-88color, or xterm-256color.
rxvt
rxvt is often built with support for 16 colors. You can try to set TERM to rxvt-16color.
konsole
konsole too is often built with support for 16 colors. You can try to set TERM to konsole-16color or xterm-16color.
After setting TERM, you can verify it by invoking ‘msgcat --color=test’ and seeing whether the output looks like a reasonable color map.
--style
The ‘--style=style_file’ option specifies the style file to use when colorizing. It has an effect only when the --color option is effective.
If the --style option is not specified, the environment variable PO_STYLE is considered. It is meant to point to the user's preferred style for PO files.
The default style file is ‘$prefix/share/gettext/styles/po-default.css’, where $prefix is the installation location.
A few style files are predefined:
This style imitates the look used by vim 7.
This style imitates the look used by GNU Emacs 21 and 22 in an X11 window.
This style imitates the look used by GNU Emacs 22 in a terminal of type ‘xterm’ (8 colors) or ‘xterm-16color’ (16 colors) or ‘xterm-256color’ (256 colors), respectively.
You can use these styles without specifying a directory. They are actually located in ‘$prefix/share/gettext/styles/’, where $prefix is the installation location.
You can also design your own styles. This is described in the next section.
The same style file can be used for styling of a PO file, for terminal output and for HTML output. It is written in CSS (Cascading Style Sheet) syntax. See http://www.w3.org/TR/css2/cover.html for a formal definition of CSS. Many HTML authoring tutorials also contain explanations of CSS.
In the case of HTML output, the style file is embedded in the HTML output.
In the case of text output, the style file is interpreted by the msgcat program. This means, in particular, that when @import is used with relative file names, the file names are relative to the resulting HTML file, in the case of HTML output, and relative to the style sheet containing the @import, in the case of text output. (Actually, @imports are not yet supported in this case, due to a limitation in libcroco.)
CSS rules are built up from selectors and declarations. The declarations specify graphical properties; the selectors specify when they apply.
In PO files, the following simple selectors (based on "CSS classes", see the CSS2 spec, section 5.8.3) are supported.
.header
This matches the header entry of a PO file.
.translated
This matches a translated message.
.untranslated
This matches an untranslated message (i.e. a message with empty translation).
.fuzzy
This matches a fuzzy message (i.e. a message which has a translation that needs review by the translator).
.obsolete
This matches an obsolete message (i.e. a message that was translated but is not needed by the current POT file any more).
white-space
#  translator-comments
#. extracted-comments
#: reference…
#, flag…
#| msgid previous-untranslated-string
msgid untranslated-string
msgstr translated-string
.comment
This matches all comments (translator comments, extracted comments, source file reference comments, flag comments, previous message comments, as well as the entire obsolete messages).
.translator-comment
This matches the translator comments.
.extracted-comment
This matches the extracted comments, i.e. the comments placed by the programmer at the attention of the translator.
.reference-comment
This matches the source file reference comments (entire lines).
.reference
This matches the individual source file references inside the source file reference comment lines.
.flag-comment
This matches the flag comment lines (entire lines).
.flag
This matches the individual flags inside flag comment lines.
.fuzzy-flag
This matches the ‘fuzzy’ flag inside flag comment lines.
.previous-comment
This matches the comments containing the previous untranslated string (entire lines).
.previous
This matches the previous untranslated string including the string delimiters, the associated keywords (msgid etc.) and the spaces between them.
.msgid
This matches the untranslated string including the string delimiters, the associated keywords (msgid etc.) and the spaces between them.
.msgstr
This matches the translated string including the string delimiters, the associated keywords (msgstr etc.) and the spaces between them.
.keyword
This matches the keywords (msgid, msgstr, etc.).
.string
This matches strings, including the string delimiters (double quotes).
.text
This matches the entire contents of a string (excluding the string delimiters, i.e. the double quotes).
.escape-sequence
This matches an escape sequence (starting with a backslash).
.format-directive
This matches a format string directive (starting with a ‘%’ sign in the case of most programming languages, with a ‘{’ in the case of java-format and csharp-format, with a ‘~’ in the case of lisp-format and scheme-format, or with ‘$’ in the case of sh-format).
.invalid-format-directive
This matches an invalid format string directive.
.added
In an untranslated string, this matches a part of the string that was not present in the previous untranslated string. (Not yet implemented in this release.)
.changed
In an untranslated string or in a previous untranslated string, this matches a part of the string that is changed or replaced. (Not yet implemented in this release.)
.removed
In a previous untranslated string, this matches a part of the string that is not present in the current untranslated string. (Not yet implemented in this release.)
These selectors can be combined into hierarchical selectors. For example,
.msgstr .invalid-format-directive { color: red; }
will highlight the invalid format directives in the translated strings.
In text mode, pseudo-classes (CSS2 spec, section 5.11) and pseudo-elements (CSS2 spec, section 5.12) are not supported.
The declarations in HTML mode are not limited; any graphical attribute supported by the browsers can be used.
The declarations in text mode are limited to the following properties. Other properties will be silently ignored.
color (CSS2 spec, section 14.1)
background-color (CSS2 spec, section 14.2.1)
These properties are supported. Colors will be adjusted to match the terminal's capabilities. Note that many terminals support only 8 colors.
font-weight (CSS2 spec, section 15.2.3)
This property is supported, but most terminals can only render two different weights: normal and bold. Values >= 600 are rendered as bold.
font-style (CSS2 spec, section 15.2.3)
This property is supported. The values italic and oblique are rendered the same way.
text-decoration (CSS2 spec, section 16.3.1)
This property is supported, limited to the values none and underline.
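Putting the selectors and the text-mode properties together, a personal style file might look like the one written by the sketch below. The file name `po-example.css` and the color choices are arbitrary assumptions made for the illustration; only selectors and properties listed above are used.

```shell
#!/bin/sh
# Write a small style file that uses only properties supported in text mode.
stylefile="${TMPDIR-/tmp}/po-example.css"
cat > "$stylefile" <<'EOF'
.untranslated .msgid { font-weight: bold; }
.fuzzy-flag { color: red; text-decoration: underline; }
.translator-comment { color: green; font-style: italic; }
.msgstr .invalid-format-directive { color: white; background-color: red; }
EOF

# Apply the style when a PO file is given as first argument and msgcat
# is installed; otherwise only the style file is written.
if [ -n "${1-}" ] && command -v msgcat >/dev/null 2>&1; then
    msgcat --color --style="$stylefile" "$1"
fi
```

Because the same file is valid for HTML output, it can be reused unchanged with ‘--color=html’, where the browser may render more properties than a terminal does.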
less for viewing PO files
The ‘less’ program is a popular text file browser for use in a text screen or terminal emulator. It also supports text with embedded escape sequences for colors and text decorations.
You can use less to view a PO file like this (assuming a UTF-8 environment):
msgcat --to-code=UTF-8 --color xyz.po | less -R
You can shorten this to the simple command:
less xyz.po
after these three preparations:
1. Add the options ‘-R’ and ‘-f’ to the LESS environment variable. In sh shells:
$ LESS="$LESS -R -f"
$ export LESS
2. Set the LESSOPEN and LESSCLOSE environment variables, as indicated in the manual page (‘man less’).
3. Have the LESSOPEN script recognize PO files by their file extension and invoke msgcat on them, producing a temporary file. Like this:
case "$1" in
  *.po)
    tmpfile=`mktemp "${TMPDIR-/tmp}/less.XXXXXX"`
    msgcat --to-code=UTF-8 --color "$1" > "$tmpfile"
    echo "$tmpfile"
    exit 0
    ;;
esac
For the tasks for which a combination of ‘msgattrib’, ‘msgcat’ etc. is not sufficient, a set of C functions is provided in a library, to make it possible to process PO files in your own programs. When you use this library, you don't need to write routines to parse the PO file; instead, you retrieve a pointer in memory to each of the messages contained in the PO file. Functions for writing PO files are not provided at this time.
The functions are declared in the header file ‘<gettext-po.h>’, and are defined in a library called ‘libgettextpo’.
po_file_t: This is a pointer type that refers to the contents of a PO file, after it has been read into memory.
po_message_iterator_t: This is a pointer type that refers to an iterator that produces a sequence of messages.
po_message_t: This is a pointer type that refers to a message of a PO file, including its translation.
The po_file_read function reads a PO file into memory. The file name is given as argument. The return value is a handle to the PO file's contents, valid until po_file_free is called on it. In case of error, the return value is NULL, and errno is set.
The po_file_free function frees a PO file's contents from memory, including all messages that are only implicitly accessible through iterators.
The po_file_domains function returns the domains for which the given PO file has messages. The return value is a NULL terminated array which is valid as long as the file handle is valid. For PO files which contain no ‘domain’ directive, the return value contains only one domain, namely the default domain "messages".
The po_message_iterator function returns an iterator that will produce the messages of file that belong to the given domain. If domain is NULL, the default domain is used instead. To list the messages, use the function po_next_message repeatedly.
The po_message_iterator_free function frees an iterator previously allocated through the po_message_iterator function.
The po_next_message function returns the next message from iterator and advances the iterator. It returns NULL when the iterator has reached the end of its message list.
The following functions return details of a po_message_t. Recall that the results are valid as long as the file handle is valid.
The po_message_msgid function returns the msgid (untranslated English string) of a message. This is guaranteed to be non-NULL.
The po_message_msgid_plural function returns the msgid_plural (untranslated English plural string) of a message with plurals, or NULL for a message without plural.
The po_message_msgstr function returns the msgstr (translation) of a message. For an untranslated message, the return value is an empty string.
The po_message_msgstr_plural function returns the msgstr[index] of a message with plurals, or NULL when the index is out of range or for a message without plural.
Here is an example of how these functions can be used.
    const char *filename = …;
    po_file_t file = po_file_read (filename);

    if (file == NULL)
      error (EXIT_FAILURE, errno, "couldn't open the PO file %s", filename);
    {
      /* NULL terminated array of the domains present in the file.  */
      const char * const *domains = po_file_domains (file);
      const char * const *domainp;

      for (domainp = domains; *domainp; domainp++)
        {
          const char *domain = *domainp;
          po_message_iterator_t iterator = po_message_iterator (file, domain);

          for (;;)
            {
              /* po_message_t is itself a pointer type.  */
              po_message_t message = po_next_message (iterator);

              if (message == NULL)
                break;
              {
                const char *msgid = po_message_msgid (message);
                const char *msgstr = po_message_msgstr (message);

                …
              }
            }
          po_message_iterator_free (iterator);
        }
    }
    /* Invalidates every string and message obtained above.  */
    po_file_free (file);
10.1 Invoking the msgfmt Program
10.2 Invoking the msgunfmt Program
10.3 The Format of GNU MO Files
10.1 Invoking the msgfmt Program
    msgfmt [option] filename.po …
The msgfmt program generates a binary message catalog from a textual translation description.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If an input file is ‘-’, standard input is read.
Java mode: generate a Java ResourceBundle class.
Like --java, and assume Java2 (JDK 1.2 or higher).
C# mode: generate a .NET .dll file containing a subclass of GettextResourceSet.
C# resources mode: generate a .NET ‘.resources’ file.
Tcl mode: generate a tcl/msgcat ‘.msg’ file.
Qt mode: generate a Qt ‘.qm’ file.
Write output to specified file.
Direct the program to work strictly following the Uniforum/Sun implementation. Currently this only affects the naming of the output file. If this option is not given the name of the output file is the same as the domain name. If the strict Uniforum mode is enabled the suffix ‘.mo’ is added to the file name if it is not already present.
We find this behaviour of Sun's implementation rather silly and so by default this mode is not selected.
If the output file is ‘-’, output is written to standard output.
Specify the resource name.
Specify the locale name, either a language specification of the form ll or a combined language and country specification of the form ll_CC.
Specify the base directory of classes directory hierarchy.
The class name is determined by appending the locale name to the resource name, separated with an underscore. The ‘-d’ option is mandatory. The class is written under the specified directory.
Specify the resource name.
Specify the locale name, either a language specification of the form ll or a combined language and country specification of the form ll_CC.
Specify the base directory for locale dependent ‘.dll’ files.
The ‘-l’ and ‘-d’ options are mandatory. The ‘.dll’ file is written in a subdirectory of the specified directory whose name depends on the locale.
Specify the locale name, either a language specification of the form ll or a combined language and country specification of the form ll_CC.
Specify the base directory of ‘.msg’ message catalogs.
The ‘-l’ and ‘-d’ options are mandatory. The ‘.msg’ file is written in the specified directory.
Assume the input files are Java ResourceBundles in Java .properties syntax, not in PO file syntax.
Assume the input files are NeXTstep/GNUstep localized resource files in .strings syntax, not in PO file syntax.
Perform all the checks implied by --check-format, --check-header, --check-domain.
Check language dependent format strings.
If the string represents a format string used in a printf-like function, both strings should have the same number of ‘%’ format specifiers, with matching types. If the flag c-format or possible-c-format appears in the special comment ‘#,’ for this entry, a check is performed. For example, the check will diagnose using ‘%.*s’ against ‘%s’, or ‘%d’ against ‘%s’, or ‘%d’ against ‘%x’. It can even handle positional parameters.
Normally the xgettext program automatically decides whether a string is a format string or not. This algorithm is not perfect, though. It might regard a string as a format string even though it is not used in a printf-like function, and so msgfmt might report errors where there are none.
To solve this problem the programmer can dictate the decision to the xgettext program (see the section on C format strings). The translator should not consider removing the flag from the ‘#,’ line. This "fix" would be reversed again as soon as msgmerge is called the next time.
Verify presence and contents of the header entry. See the Header Entry section for a description of the various fields in the header entry.
Check for conflicts between domain directives and the --output-file option.
Check that GNU msgfmt behaves like X/Open msgfmt. This will give an error when attempting to use the GNU extensions.
Check presence of keyboard accelerators for menu items. This is based on the convention used in some GUIs that a keyboard accelerator in a menu item string is designated by an immediately preceding ‘&’ character. Sometimes a keyboard accelerator is also called "keyboard mnemonic". This check verifies that if the untranslated string has exactly one ‘&’ character, the translated string has exactly one ‘&’ as well. If this option is given with a char argument, this char should be a non-alphanumeric character and is used as keyboard accelerator mark instead of ‘&’.
Use fuzzy entries in output. Note that using this option is usually wrong, because fuzzy messages are exactly those which have not been validated by a human translator.
Align strings to number bytes (default: 1).
Don't include a hash table in the binary file. Lookup will be more expensive at run time (binary search instead of hash table lookup).
Display this help and exit.
Output version information and exit.
Print statistics about translations.
Increase verbosity level.
10.2 Invoking the msgunfmt Program
    msgunfmt [option] [file]...
The msgunfmt program converts a binary message catalog to a Uniforum style .po file.
Java mode: input is a Java ResourceBundle class.
C# mode: input is a .NET .dll file containing a subclass of GettextResourceSet.
C# resources mode: input is a .NET ‘.resources’ file.
Tcl mode: input is a tcl/msgcat ‘.msg’ file.
Input .mo files.
If no input file is given or if it is ‘-’, standard input is read.
Specify the resource name.
Specify the locale name, either a language specification of the form ll or a combined language and country specification of the form ll_CC.
The class name is determined by appending the locale name to the resource name, separated with an underscore. The class is located using the CLASSPATH.
Specify the resource name.
Specify the locale name, either a language specification of the form ll or a combined language and country specification of the form ll_CC.
Specify the base directory for locale dependent ‘.dll’ files.
The ‘-l’ and ‘-d’ options are mandatory. The ‘.dll’ file is located in a subdirectory of the specified directory whose name depends on the locale.
Specify the locale name, either a language specification of the form ll or a combined language and country specification of the form ll_CC.
Specify the base directory of ‘.msg’ message catalogs.
The ‘-l’ and ‘-d’ options are mandatory. The ‘.msg’ file is located in the specified directory.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Display this help and exit.
Output version information and exit.
Increase verbosity level.
10.3 The Format of GNU MO Files
The format of the generated MO files is best described by a picture, which appears below.
The first two words serve to identify the file. The magic number will always signal GNU MO files. The number is stored in the byte order of the generating machine, so the magic number really is two numbers: 0x950412de and 0xde120495. The second word describes the current revision of the file format. For now the revision is 0. This might change in future versions, and ensures that the readers of MO files can distinguish new formats from old ones, so that both can be handled correctly. The version is kept separate from the magic number, instead of using different magic numbers for different formats, mainly because ‘/etc/magic’ is not updated often. It might be better to have magic separated from internal format version identification.
A number of pointers to later tables in the file follow, allowing for the extension of the prefix part of MO files without having to recompile programs reading them. This might become useful for later inserting a few flag bits, an indication of the charset used, new tables, or other things.
Then, at offset O and offset T in the picture, two tables of string descriptors can be found. In both tables, each string descriptor uses two 32-bit integers, one for the string length, another for the offset of the string in the MO file, counting in bytes from the start of the file. The first table contains descriptors for the original strings, and is sorted so the original strings are in increasing lexicographical order. The second table contains descriptors for the translated strings, and is parallel to the first table: to find the corresponding translation one has to access the slot in the second table with the same index.
Having the original strings sorted enables the use of simple binary search, for when the MO file does not contain a hash table, or for when it is not practical to use the hash table provided in the MO file. This has another advantage: GNU gettext usually translates the empty string in a PO file into some system information attached to that particular MO file, and the empty string necessarily becomes the first in both the original and translated tables, making the system information very easy to find.
The size S of the hash table can be zero. In this case, the hash table itself is not contained in the MO file. Some people might prefer this because a precomputed hash table takes disk space, and does not gain that much speed. The hash table contains indices to the sorted array of strings in the MO file. Conflict resolution is done by double hashing. The precise hashing algorithm used is fairly dependent on GNU gettext code, and is not documented here.
As for the strings themselves, they follow the hash table, and each is terminated with a <NUL>, and this <NUL> is not counted in the length which appears in the string descriptor. The msgfmt program has an option selecting the alignment for MO file strings. With this option, each string is separately aligned so it starts at an offset which is a multiple of the alignment value. On some RISC machines, a correct alignment will speed things up.
Contexts are stored by storing the concatenation of the context, a <EOT> byte, and the original string, instead of the original string.
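As a minimal sketch of the key layout just described, the following C helper concatenates a context, an <EOT> byte (value 4) and the original string. The name make_context_key is hypothetical and is not part of GNU gettext; it only illustrates how such a key could be assembled.

```c
#include <stdlib.h>
#include <string.h>

/* Build the lookup key for a message that has a context:
   context, one <EOT> byte (0x04), then the original string.
   Returns a malloc'ed, NUL terminated string that the caller must free,
   or NULL if memory is exhausted.
   make_context_key is a hypothetical helper, not a GNU gettext function.  */
char *make_context_key(const char *context, const char *msgid)
{
    size_t ctx_len = strlen(context);
    size_t id_len = strlen(msgid);
    char *key = malloc(ctx_len + 1 + id_len + 1);

    if (key == NULL)
        return NULL;
    memcpy(key, context, ctx_len);
    key[ctx_len] = '\004';                        /* the <EOT> separator */
    memcpy(key + ctx_len + 1, msgid, id_len + 1); /* includes the final NUL */
    return key;
}
```

For example, the context "Menu" and the original string "Open" would give a 9-byte key, which is then looked up like any other original string.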
Plural forms are stored by letting the plural of the original string follow the singular of the original string, separated through a <NUL> byte. The length which appears in the string descriptor includes both. However, only the singular of the original string takes part in the hash table lookup. The plural variants of the translation are all stored consecutively, separated through a <NUL> byte. Here also, the length in the string descriptor includes all of them.
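To illustrate the consecutive storage of plural variants, here is a small sketch that walks a buffer of <NUL> separated forms and returns the one with a given index. The function name nth_plural_form is an assumption made for this example; the length parameter is meant to be the length taken from the string descriptor, which does not count the final <NUL>.

```c
#include <stddef.h>
#include <string.h>

/* Return the index-th form stored in DATA, where the forms are separated
   by NUL bytes and LENGTH is the descriptor length (final NUL excluded).
   Returns NULL when INDEX is out of range.
   nth_plural_form is a hypothetical helper, not a GNU gettext function.  */
const char *nth_plural_form(const char *data, size_t length, unsigned int index)
{
    const char *p = data;
    const char *end = data + length;   /* position of the final NUL */

    while (index > 0) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));

        if (nul == NULL)
            return NULL;               /* fewer forms than requested */
        p = nul + 1;                   /* start of the next form */
        index--;
    }
    return p;
}
```

The same walk works for the original string of a plural message, where index 0 is the singular and index 1 the plural.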
Nothing prevents a MO file from having embedded <NUL>s in strings. However, the program interface currently used already presumes that strings are <NUL> terminated, so embedded <NUL>s are somewhat useless. But the MO file format is general enough so other interfaces would be later possible, if for example, we ever want to implement wide characters right in MO files, where <NUL> bytes may accidentally appear. (No, we don't want to have wide characters in MO files. They would make the file unnecessarily large, and the ‘wchar_t’ type being platform dependent, MO files would be platform dependent as well.)
This particular issue has been strongly debated in the GNU gettext development forum, and it is to be expected that the MO file format will evolve or change over time. It is even possible that many formats may later be supported concurrently. But surely, we have to start somewhere, and the MO file format described here is a good start. Nothing is cast in concrete, and the format may later evolve fairly easily, so we should feel comfortable with the current approach.
            byte
                 +------------------------------------------+
              0  | magic number = 0x950412de                |
                 |                                          |
              4  | file format revision = 0                 |
                 |                                          |
              8  | number of strings                        |  == N
                 |                                          |
             12  | offset of table with original strings    |  == O
                 |                                          |
             16  | offset of table with translation strings |  == T
                 |                                          |
             20  | size of hashing table                    |  == S
                 |                                          |
             24  | offset of hashing table                  |  == H
                 |                                          |
                 .                                          .
                 .    (possibly more entries later)         .
                 .                                          .
                 |                                          |
              O  | length & offset 0th string  ----------------.
          O + 8  | length & offset 1st string  ------------------.
                  ...                                    ...   | |
    O + ((N-1)*8)| length & offset (N-1)th string           |  | |
                 |                                          |  | |
              T  | length & offset 0th translation  ---------------.
          T + 8  | length & offset 1st translation  -----------------.
                  ...                                    ...   | | | |
    T + ((N-1)*8)| length & offset (N-1)th translation      |  | | | |
                 |                                          |  | | | |
              H  | start hash table                         |  | | | |
                  ...                                    ...   | | | |
      H + S * 4  | end hash table                           |  | | | |
                 |                                          |  | | | |
                 | NUL terminated 0th string  <----------------' | | |
                 |                                          |    | | |
                 | NUL terminated 1st string  <------------------' | |
                 |                                          |      | |
                  ...                                    ...       | |
                 |                                          |      | |
                 | NUL terminated 0th translation  <---------------' |
                 |                                          |        |
                 | NUL terminated 1st translation  <-----------------'
                 |                                          |
                  ...                                    ...
                 |                                          |
                 +------------------------------------------+
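The fixed part of the layout pictured above can be read with a few lines of C. The sketch below is only an illustration under stated assumptions: the structure mo_header, the field names and the function mo_parse_header are invented for this example and are not taken from the GNU gettext sources. It detects the byte order from the two possible values of the magic number and reads the seven 32-bit words at offsets 0 to 24.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MO_MAGIC         0x950412deu
#define MO_MAGIC_SWAPPED 0xde120495u

/* Hypothetical in-memory copy of the fixed MO header fields.  */
struct mo_header {
    uint32_t revision;          /* file format revision */
    uint32_t nstrings;          /* N */
    uint32_t orig_tab_offset;   /* O */
    uint32_t trans_tab_offset;  /* T */
    uint32_t hash_tab_size;     /* S */
    uint32_t hash_tab_offset;   /* H */
};

static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Read the 32-bit word at byte offset OFF, swapping it if the file was
   written on a machine with the opposite byte order.  */
static uint32_t read_word(const unsigned char *buf, size_t off, int swapped)
{
    uint32_t v;

    memcpy(&v, buf + off, sizeof v);
    return swapped ? swap32(v) : v;
}

/* Parse the header found at the start of BUF (LEN bytes).
   Returns 0 on success, -1 if the buffer is too short or the magic
   number is not one of the two expected values.  */
int mo_parse_header(const unsigned char *buf, size_t len, struct mo_header *out)
{
    uint32_t magic;
    int swapped;

    if (len < 28)
        return -1;
    memcpy(&magic, buf, sizeof magic);
    if (magic == MO_MAGIC)
        swapped = 0;
    else if (magic == MO_MAGIC_SWAPPED)
        swapped = 1;
    else
        return -1;

    out->revision         = read_word(buf, 4, swapped);
    out->nstrings         = read_word(buf, 8, swapped);
    out->orig_tab_offset  = read_word(buf, 12, swapped);
    out->trans_tab_offset = read_word(buf, 16, swapped);
    out->hash_tab_size    = read_word(buf, 20, swapped);
    out->hash_tab_offset  = read_word(buf, 24, swapped);
    return 0;
}
```

A real reader would go on to validate that the table offsets and the string descriptors lie within the file before dereferencing them; that step is left out here to keep the sketch short.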
One aim of the current message catalog implementation provided by GNU gettext was to use the system's message catalog handling, if the installer wishes to do so. So we perhaps should first take a look at the solutions we know about. The people in the POSIX committee did not manage to agree on one of the semi-official standards which we'll describe below. In fact they couldn't agree on anything, so they decided only to include an example of an interface. The major Unix vendors are split in the usage of the two most important specifications: X/Open's catgets vs. Uniforum's gettext interface. We'll describe them both and later explain our solution to this dilemma.
catgets
The catgets implementation is defined in the X/Open Portability Guide, Volume 3, XSI Supplementary Definitions, Chapter 5. But the process of creating this standard seemed to be too slow for some of the Unix vendors so they created their implementations on preliminary versions of the standard. Of course this leads again to problems while writing platform independent programs: even the usage of catgets does not guarantee a unique interface.
Another, personal comment on this: only a bunch of committee members could have made this interface. They never really tried to program using this interface. It is a fast, memory-saving implementation, and a user can happily live with it. But programmers hate it (at least I and some others do…)
But we must not forget one point: after all the trouble with transferring the rights on Unix(tm) they at last came to X/Open, the very same who published this specification. This leads me to making the prediction that this interface will be in future Unix standards (e.g. Spec1170) and therefore part of all Unix implementations (implementations which are allowed to wear this name).
11.1.1 The Interface
11.1.2 Problems with the catgets Interface?!
11.1.1 The Interface
The interface to the catgets implementation consists of three functions which correspond to those used in file access: catopen to open the catalog for using, catgets for accessing the message tables, and catclose for closing after work is done. Prototypes for the functions and the needed definitions are in the <nl_types.h> header file.
catopen is used like this:
    nl_catd catd = catopen ("catalog_name", 0);
The function takes as its first argument the name of the catalog. This usually refers to the name of the program or the package. The second parameter is not further specified in the standard. I don't even know whether it is implemented consistently among various systems. So the common advice is to use 0 as the value. The return value is a handle to the message catalog, equivalent to the file handles returned by open.
This handle is of course used in the catgets function which can be used like this:
    char *translation = catgets (catd, set_no, msg_id, "original string");
The first parameter is this catalog descriptor. The second parameter specifies the set of messages in this catalog, in which the message described by msg_id is obtained. catgets therefore uses a three-stage addressing:
    catalog name ⇒ set number ⇒ message ID ⇒ translation
The fourth argument is not used to address the translation. It is given as a default value in case one of the addressing stages fails. One important thing to remember is that although the return type of catgets is char * the resulting string must not be changed. It should better be const char *, but the standard was published in 1988, one year before ANSI C.
The last of these functions is used and behaves as expected:
    catclose (catd);
After this no catgets call using the descriptor is legal anymore.
11.1.2 Problems with the catgets Interface?!
This description may have seemed really easy, so where are the problems we speak of? In fact the interface could be used in a reasonable way, but constructing the message catalogs is a pain. The reason for this lies in the third argument of catgets: the unique message ID. This has to be a numeric value for all messages in a single set. Perhaps you can imagine the problems of keeping such a list while changing the source code. Add a new message here, remove one there. Of course a lot of tools have been developed to help organize this chaos, but each of them fails in one aspect or another. We don't want to say that the other approach has no problems, but they are far easier to manage.
gettext
The definition of the gettext interface comes from a Uniforum proposal. It was submitted there by Sun, who had implemented the gettext function in SunOS 4, around 1990. Nowadays, the gettext interface is specified by the OpenI18N standard.
The main point about this solution is that it does not follow the method of normal file handling (open-use-close) and that it does not burden the programmer with so many tasks, especially the unique key handling. Of course a unique key is needed here too, but this key is the message itself (however long or short it is). See the Comparison section for a more detailed comparison of the two methods.
The following section contains a rather detailed description of the interface. We make it that detailed because this is the interface we chose for the GNU gettext library. Programmers interested in using this library will be interested in this description.
11.2.1 The Interface
11.2.2 Solving Ambiguities
11.2.3 Locating Message Catalog Files
11.2.4 How to specify the output character set gettext uses
11.2.5 Using contexts for solving ambiguities
11.2.6 Additional functions for plural forms
11.2.7 Optimization of the *gettext functions
11.2.1 The Interface
The minimal functionality an interface must have is a) to select a domain the strings are coming from (a single domain for all programs is not reasonable because its construction and maintenance is difficult, perhaps impossible) and b) to access a string in a selected domain.
This is essentially the description of the gettext interface. It has a global domain which unqualified usages reference. Of course this domain is selectable by the user.
    char *textdomain (const char *domain_name);
This provides the possibility to change or query the current status of the current global domain of the LC_MESSAGES category. The argument is a null-terminated string, whose characters must be legal for use in filenames. If the domain_name argument is NULL, the function returns the current value. If no value has been set before, the name of the default domain is returned: messages. Please note that although the return value of textdomain is of type char * it must not be changed. It is also important to know that no checks of the availability are made. If the name is not available you will see this by the fact that no translations are provided.
To use a domain set by textdomain the function
    char *gettext (const char *msgid);
is to be used. This is the simplest reasonable form one can imagine. The translation of the string msgid is returned if it is available in the current domain. If it is not available, the argument itself is returned. If the argument is NULL the result is undefined.
One thing which should come to mind is that no explicit dependency on the used domain is given. The current value of the domain is used. If this changes between two executions of the same gettext call in the program, the two calls reference different message catalogs.
For the easiest case, which is normally used in internationalized packages, a call to textdomain is issued once at the beginning of execution, setting the domain to a unique name, normally the package name. In the following code all strings which have to be translated are filtered through the gettext function. That's all, the package speaks your language.
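The easiest case described above can be sketched as follows. The domain name "hello-demo" and the two function names are hypothetical; setlocale is called first on the assumption that the program wants to follow the user's environment.

```c
#include <libintl.h>
#include <locale.h>

/* Called once at program start: select the user's locale and make
   "hello-demo" (a hypothetical package name) the current global domain.  */
void init_i18n(void)
{
    setlocale(LC_ALL, "");
    textdomain("hello-demo");
}

/* Every user-visible string is then filtered through gettext.  When no
   translation is available, gettext returns its argument unchanged.  */
const char *greeting(void)
{
    return gettext("Hello, world!");
}
```

After init_i18n, a call such as textdomain (NULL) should report "hello-demo", and greeting falls back to the English text when no catalog for that domain is installed.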
11.2.2 Solving Ambiguities
While this single name domain works well for most applications there might be the need to get translations from more than one domain. Of course one could switch between different domains with calls to textdomain, but this is neither convenient nor fast. A possible situation could be one case subject to discussion during this writing: all error messages of functions in the set of commonly used functions should go into a separate domain error. By this means we would only need to translate them once. Another case is messages from a library, as these have to be independent of the current domain set by the application.
For these reasons there are two more functions to retrieve strings:
    char *dgettext (const char *domain_name, const char *msgid);
    char *dcgettext (const char *domain_name, const char *msgid, int category);
Both take an additional first argument, which corresponds to the argument of textdomain. The third argument of dcgettext allows the use of a locale category other than LC_MESSAGES. But I really don't know where this can be useful. If the domain_name is NULL or category has a value other than the known ones, the result is undefined. It should also be noted that this function is not part of the second known implementation of this function family, the one found in Solaris.
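The library case mentioned above might look like this sketch, in which a library keeps its messages in its own domain regardless of what the application passed to textdomain. The domain "libdemo" and the function libdemo_strerror are invented for illustration.

```c
#include <libintl.h>
#include <locale.h>

/* Return a (possibly translated) description of an error code.
   The messages are always taken from the hypothetical "libdemo" domain,
   never from the application's current global domain.  */
const char *libdemo_strerror(int code)
{
    switch (code) {
    case 0:
        return dgettext("libdemo", "success");
    case 1:
        return dgettext("libdemo", "file not found");
    default:
        /* Equivalent to dgettext here, with the category made explicit.  */
        return dcgettext("libdemo", "unknown error", LC_MESSAGES);
    }
}
```

Because the domain is passed on every call, the application's own textdomain setting is left untouched.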
A second ambiguity can arise from the fact that perhaps more than one domain has the same name. This can be solved by specifying where the needed message catalog files can be found.
    char *bindtextdomain (const char *domain_name, const char *dir_name);
Calling this function binds the given domain to a file in the specified directory (how this file is determined follows below). In particular, a file in the system's default place is no longer favored over the specified file (as it would be by solely using textdomain). A NULL pointer for the dir_name parameter returns the binding associated with domain_name. If domain_name itself is NULL nothing happens and a NULL pointer is returned. Here again, as for all the other functions, none of the return values may be changed!
It is important to remember that relative path names for the dir_name parameter can be trouble. Since the path is always computed relative to the current directory, different results will be achieved when the program executes a chdir command. Relative paths should always be avoided to avoid dependencies and unreliabilities.
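One way to follow the advice about relative paths is to wrap bindtextdomain in a helper that rejects them. The sketch below assumes Unix-style path names, where an absolute path starts with ‘/’; the name bind_catalog_dir is hypothetical.

```c
#include <libintl.h>
#include <stddef.h>

/* Bind DOMAIN to the catalog directory DIR, but only if DIR is an
   absolute path (assumed here to mean: it starts with '/').
   Returns 0 on success, -1 if DIR is relative or the binding failed.
   bind_catalog_dir is a hypothetical helper for this example.  */
int bind_catalog_dir(const char *domain, const char *dir)
{
    if (dir == NULL || dir[0] != '/')
        return -1;      /* a relative path would depend on chdir calls */
    return bindtextdomain(domain, dir) != NULL ? 0 : -1;
}
```

Afterwards, bindtextdomain (domain, NULL) can be used to query the directory that was bound.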
11.2.3 Locating Message Catalog Files
Because many different languages for many different packages have to be stored, we need some way to attach this information to message catalog files. The way usually used in Unix environments is to encode it in the file name. This is also done here. The directory name given in bindtextdomain's second argument (or the default directory), followed by the name of the locale, the locale category, and the domain name are concatenated:
    dir_name/locale/LC_category/domain_name.mo
The default value for dir_name is system specific. For the GNU library, and for packages adhering to its conventions, it's:
    /usr/local/share/locale
locale is the name of the locale category which is designated by LC_category. For gettext and dgettext this LC_category is always LC_MESSAGES. (3) The name of the locale category is determined through setlocale (LC_category, NULL). (4) When using the function dcgettext, you can specify the locale category through the third argument.
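The concatenation rule above can be written down directly. The following sketch only builds the file name from its four parts; it does not reproduce whatever additional locale name fallbacks an actual implementation may try. The function name catalog_path is an assumption made for this example.

```c
#include <stdio.h>

/* Build "dir_name/locale/LC_category/domain_name.mo" in BUF.
   Returns 0 on success, -1 if the result did not fit into BUFSIZE bytes.
   catalog_path is a hypothetical helper, not a GNU gettext function.  */
int catalog_path(char *buf, size_t bufsize, const char *dir_name,
                 const char *locale, const char *category,
                 const char *domain_name)
{
    int n = snprintf(buf, bufsize, "%s/%s/%s/%s.mo",
                     dir_name, locale, category, domain_name);

    return (n >= 0 && (size_t) n < bufsize) ? 0 : -1;
}
```

With the default directory, the locale "de", the category "LC_MESSAGES" and a hypothetical domain "hello-demo", this gives /usr/local/share/locale/de/LC_MESSAGES/hello-demo.mo.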
[ < ] | [ > ] | [ << ] | [Plus haut] | [ >> ] | [Top] | [Table des matières] | [Index] | [ ? ] |
gettext
uses gettext
not only looks up a translation in a message catalog. It
also converts the translation on the fly to the desired output character
set. This is useful if the user is working in a different character set
than the translator who created the message catalog, because it avoids
distributing variants of message catalogs which differ only in the character
set.
The output character set is, by default, the value of nl_langinfo (CODESET), which depends on the LC_CTYPE part of the current locale. But programs which store strings in a locale-independent way (e.g. UTF-8) can request that gettext and related functions return the translations in that encoding, by use of the bind_textdomain_codeset function.
Note that the msgid argument to gettext is not subject to character set conversion. Also, when gettext does not find a translation for msgid, it returns msgid unchanged, independently of the current output character set. It is therefore recommended that all msgids be US-ASCII strings.
The bind_textdomain_codeset function can be used to specify the output character set for message catalogs for domain domainname. The codeset argument must be a valid codeset name which can be used for the iconv_open function, or a null pointer.
If the codeset parameter is the null pointer, bind_textdomain_codeset returns the currently selected codeset for the domain with the name domainname. It returns NULL if no codeset has yet been selected.
The bind_textdomain_codeset function can be used several times. If used multiple times with the same domainname argument, the later call overrides the settings made by the earlier one.
The bind_textdomain_codeset function returns a pointer to a string containing the name of the selected codeset. The string is allocated internally in the function and must not be changed by the user. If the system ran out of memory during the execution of bind_textdomain_codeset, the return value is NULL and the global variable errno is set accordingly.
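As a minimal sketch of the two calling modes described above (setting a codeset, and querying it with a null pointer), one might wrap the function like this. The wrapper names request_utf8 and current_codeset are hypothetical; only bind_textdomain_codeset itself comes from <libintl.h>.

```c
#include <libintl.h>
#include <stddef.h>

/* Ask for UTF-8 output for one domain; returns the selected codeset
   name, or NULL on failure (for example, out of memory). */
static const char *
request_utf8 (const char *domainname)
{
  return bind_textdomain_codeset (domainname, "UTF-8");
}

/* Query the current setting without changing it; NULL means that no
   codeset has been selected for this domain yet. */
static const char *
current_codeset (const char *domainname)
{
  return bind_textdomain_codeset (domainname, NULL);
}
```

A program that keeps its strings in UTF-8 internally would typically make the first call once per domain, right after bindtextdomain.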
One place where the gettext functions, if used normally, have big problems is within programs with graphical user interfaces (GUIs). The problem is that many of the strings which have to be translated are very short. They have to appear in pull-down menus, which restricts the length. But strings which do not contain entire sentences, or at least large fragments of a sentence, may appear in more than one situation in the program and might need different translations. This is especially true for the one-word strings which are frequently used in GUI programs.
As a consequence many people say that the gettext approach is wrong and that catgets should be used instead, which indeed does not have this problem. But there is a very simple and powerful method to handle this kind of problem with the gettext functions.
Contexts can be added to strings to be translated. A context-dependent translation lookup searches for the translation of a given string only within a given context. The translation for the same string in a different context can be different. The different translations of the same string in different contexts can be stored in the same MO file, and can be edited by the translator in the same PO file.
The ‘gettext.h’ include file contains the lookup macros for strings with contexts. They are implemented as thin macros and inline functions over the functions from <libintl.h>.
    const char *pgettext (const char *msgctxt, const char *msgid);
In a call of this macro, msgctxt and msgid must be string literals. The macro returns the translation of msgid, restricted to the context given by msgctxt.
The msgctxt string is visible in the PO file to the translator. You should try to make it canonical and never changing, because every time you change an msgctxt, the translator will have to review the translation of msgid.
Finding a canonical msgctxt string that doesn't change over time can be hard. But you shouldn't use the file name or class name containing the pgettext call, because it is a common development task to rename a file or a class, and it shouldn't cause translator work. Also you shouldn't use a comment in the form of a complete English sentence as msgctxt, because orthography or grammar changes are often applied to such sentences, and again, it shouldn't force the translator to do a review.
The ‘p’ in ‘pgettext’ stands for “particular”: pgettext fetches a particular translation of the msgid.
    const char *dpgettext (const char *domain_name, const char *msgctxt, const char *msgid);
    const char *dcpgettext (const char *domain_name, const char *msgctxt, const char *msgid, int category);
These are generalizations of pgettext. They behave similarly to dgettext and dcgettext, respectively. The domain_name argument defines the translation domain. The category argument allows the use of a locale category other than LC_MESSAGES.
As an example consider the following fictional situation. A GUI program has a menu bar with the following entries:
    +------------+------------+--------------------------------------+
    | File       | Printer    |                                      |
    +------------+------------+--------------------------------------+
    | Open     | | Select   |
    | New      | | Open     |
    +----------+ | Connect  |
                 +----------+
To have the strings File, Printer, Open, New, Select, and Connect translated, there has to be at some point in the code a call to a function of the gettext family. But in two places the string passed into the function would be Open. The translations might not be the same, and therefore we are in the dilemma described above.
What distinguishes the two places is the menu path from the menu root to the particular menu entries:
    Menu|File
    Menu|Printer
    Menu|File|Open
    Menu|File|New
    Menu|Printer|Select
    Menu|Printer|Open
    Menu|Printer|Connect
The context is thus the menu path without its last part. So, the calls look like this:
    pgettext ("Menu|", "File")
    pgettext ("Menu|", "Printer")
    pgettext ("Menu|File|", "Open")
    pgettext ("Menu|File|", "New")
    pgettext ("Menu|Printer|", "Select")
    pgettext ("Menu|Printer|", "Open")
    pgettext ("Menu|Printer|", "Connect")
Whether or not to use the ‘|’ character at the end of the context is a matter of style.
For more complex cases, where the msgctxt or msgid are not string literals, more general macros are available:
    const char *pgettext_expr (const char *msgctxt, const char *msgid);
    const char *dpgettext_expr (const char *domain_name, const char *msgctxt, const char *msgid);
    const char *dcpgettext_expr (const char *domain_name, const char *msgctxt, const char *msgid, int category);
Here msgctxt and msgid can be arbitrary string-valued expressions. When both argument expressions are string literals, however, the macros without the ‘_expr’ suffix are more efficient.
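To make the idea of a context-restricted lookup more concrete, here is a rough sketch of how such a macro could be layered over dcgettext from <libintl.h>. The names CTX_GLUE, ctx_lookup and MY_PGETTEXT are hypothetical. The sketch assumes the context and msgid are joined by the byte '\004' to form the catalog key, as GNU ‘gettext.h’ does, and it is not a replacement for that header.

```c
#include <libintl.h>
#include <locale.h>

/* Assumption: the context and msgid are joined with the byte '\004'
   (EOT) to form the catalog key, as GNU gettext.h does. */
#define CTX_GLUE "\004"

/* Hypothetical helper: look up the glued key in the default domain;
   if no translation is found, fall back to the bare msgid instead of
   returning the glued key. */
static const char *
ctx_lookup (const char *ctx_id, const char *msgid)
{
  const char *translation = dcgettext (NULL, ctx_id, LC_MESSAGES);
  return translation == ctx_id ? msgid : translation;
}

/* Both arguments must be string literals so they can be concatenated
   at compile time. */
#define MY_PGETTEXT(Ctx, Id) ctx_lookup (Ctx CTX_GLUE Id, Id)
```

With no catalog installed, MY_PGETTEXT ("Menu|File|", "Open") and MY_PGETTEXT ("Menu|Printer|", "Open") both yield "Open"; with a catalog, each key can carry its own translation.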
The functions of the gettext family described so far (and all the catgets functions as well) have one problem in the real world which has been neglected completely in all existing approaches: the handling of plural forms.
Looking through Unix source code before the time anybody thought about internationalization (and, sadly, even afterwards) one can often find code similar to the following:
printf ("%d file%s deleted", n, n == 1 ? "" : "s"); |
After the first complaints from people internationalizing the code, people either completely avoided formulations like this or used strings like "file(s)". Both look unnatural and should be avoided. First tries to solve the problem correctly looked like this:
    if (n == 1)
      printf ("%d file deleted", n);
    else
      printf ("%d files deleted", n);
But this does not solve the problem. It helps languages where the plural form of a noun is not simply constructed by adding an ‘s’, but that is all. Once again people fell into the trap of believing the rules their language is using are universal. But the handling of plural forms differs widely between the language families. For example, Rafal Maszkowski <rzm@mat.uni.torun.pl> reports:
In Polish we use e.g. plik (file) this way:
    1 plik
    2,3,4 pliki
    5-21 pliko'w
    22-24 pliki
    25-31 pliko'w
and so on (o' means 8859-2 oacute which should be rather okreska, similar to aogonek).
There are two things which can differ between languages (and even inside language families): how plural forms are built, and how many plural forms there are. Germanic and Romance languages have two forms, but other language families have only one form or many forms. More information on this in an extra section.
The consequence of this is that application writers should not try to solve the problem in their code. This would be localization, since it is only usable for certain, hardcoded language environments. Instead the extended gettext interface should be used.
Instead of the one key string, these extra functions take two strings and a numerical argument. The idea behind this is that, using the numerical argument and the first string as a key, the implementation can select the right plural form using rules specified by the translator. The two string arguments are then used to provide a return value in case no message catalog is found (similar to the normal gettext behavior). In this case the rules for Germanic languages are used, and it is assumed that the first string argument is the singular form and the second the plural form.
This has the consequence that programs without language catalogs can display the correct strings only if the program itself is written using a Germanic language. This is a limitation, but since the GNU C library (as well as the GNU gettext package) is written as part of the GNU package, and the coding standards for the GNU project require programs to be written in English, this solution nevertheless fulfills its purpose.
The ngettext function is similar to the gettext function as it finds the message catalogs in the same way. But it takes two extra arguments. The msgid1 parameter must contain the singular form of the string to be converted. It is also used as the key for the search in the catalog. The msgid2 parameter is the plural form. The parameter n is used to determine the plural form. If no message catalog is found, msgid1 is returned if n == 1, otherwise msgid2.
An example for the use of this function is:
    printf (ngettext ("%d file removed", "%d files removed", n), n);
Please note that the numeric value n has to be passed to the printf function as well. It is not sufficient to pass it only to ngettext.
In the English singular case, the number – always 1 – can be replaced with "one":
printf (ngettext ("One file removed", "%d files removed", n), n); |
This works because the ‘printf’ function discards excess arguments that are not consumed by the format string.
It is also possible to use this function when the strings don't contain a cardinal number:
puts (ngettext ("Delete the selected file?", "Delete the selected files?", n)); |
In this case the number n is only used to choose the plural form.
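The point that n must be passed twice, once to select the form and once to fill in the conversion, can be packaged in a small helper. The following is a sketch only; format_removed is a hypothetical name, and the output described assumes no message catalog is found, so the English fallback rule applies.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper: writes the "files removed" message into buf,
   choosing the form with ngettext and passing n a second time for
   the %lu conversion. Returns what snprintf returns. */
static int
format_removed (char *buf, size_t size, unsigned long n)
{
  return snprintf (buf, size,
                   ngettext ("%lu file removed", "%lu files removed", n),
                   n);
}
```

Without a catalog, n = 1 selects the first string and every other value, including 0, selects the second.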
The dngettext function is similar to the dgettext function in the way the message catalog is selected. The difference is that it takes two extra parameters to provide the correct plural form. These two parameters are handled in the same way ngettext handles them.
The dcngettext function is similar to the dcgettext function in the way the message catalog is selected. The difference is that it takes two extra parameters to provide the correct plural form. These two parameters are handled in the same way ngettext handles them.
Now, how do these functions solve the problem of the plural forms? Without the input of linguists (which was not available) it was not possible to determine whether there are only a few different schemes for building plural forms or whether the number can increase with every new supported language.
Therefore the solution implemented is to allow the translator to specify the rules of how to select the plural form. Since the formula varies with every language this is the only viable solution except for hardcoding the information in the code (which still would require the possibility of extensions to not prevent the use of new languages).
The information about the plural form selection has to be stored in the header entry of the PO file (the one with the empty msgid string). The plural form information looks like this:
    Plural-Forms: nplurals=2; plural=n == 1 ? 0 : 1;
The nplurals value must be a decimal number which specifies how many different plural forms exist for this language. The string following plural is an expression which uses the C language syntax. Exceptions are that no negative numbers are allowed, numbers must be decimal, and the only variable allowed is n. Spaces are allowed in the expression, but backslash-newlines are not; in the examples below the backslash-newlines are present for formatting purposes only. This expression will be evaluated whenever one of the functions ngettext, dngettext, or dcngettext is called. The numeric value passed to these functions is then substituted for all uses of the variable n in the expression. The resulting value must then be greater than or equal to zero and smaller than the value given as the value of nplurals.
The following rules are known at this point. The languages are listed with their families. But this does not necessarily mean the information can be generalized for the whole family (as can be easily seen in the table below).(5)
Some languages only require one single form. There is no distinction between the singular and plural form. An appropriate header entry would look like this:
Plural-Forms: nplurals=1; plural=0; |
Languages with this property include:
Japanese, Korean, Vietnamese
Turkish
Two forms, with the singular used for one only, is the scheme used in most existing programs since it is what English uses. A header entry would look like this:
    Plural-Forms: nplurals=2; plural=n != 1;
(Note: this uses the feature of C expressions that boolean expressions have the value zero or one.)
Languages with this property include:
Danish, Dutch, English, Faroese, German, Norwegian, Swedish
Estonian, Finnish
Greek
Hebrew
Italian, Portuguese, Spanish
Esperanto
Another language using the same header entry is:
Hungarian
Hungarian does not appear to have a plural if you look at sentences involving cardinal numbers. For example, “1 apple” is “1 alma”, and “123 apples” is “123 alma”. But when the number is not explicit, the distinction between singular and plural exists: “the apple” is “az alma”, and “the apples” is “az almák”. Since ngettext has to support both types of sentences, it is classified here, under “two forms”.
Two forms, with the singular used for zero and one, is an exceptional case in the language family. The header entry would be:
    Plural-Forms: nplurals=2; plural=n>1;
Languages with this property include:
French, Brazilian Portuguese
The header entry would be:
Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n != 0 ? 1 : 2; |
Languages with this property include:
Latvian
The header entry would be:
Plural-Forms: nplurals=3; plural=n==1 ? 0 : n==2 ? 1 : 2; |
Languages with this property include:
Gaeilge (Irish)
The header entry would be:
    Plural-Forms: nplurals=3; \
        plural=n==1 ? 0 : (n==0 || (n%100 > 0 && n%100 < 20)) ? 1 : 2;
Languages with this property include:
Romanian
The header entry would look like this:
    Plural-Forms: nplurals=3; \
        plural=n%10==1 && n%100!=11 ? 0 : \
               n%10>=2 && (n%100<10 || n%100>=20) ? 1 : 2;
Languages with this property include:
Lithuanian
The header entry would look like this:
    Plural-Forms: nplurals=3; \
        plural=n%10==1 && n%100!=11 ? 0 : \
               n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
Languages with this property include:
Croatian, Serbian, Russian, Ukrainian
The header entry would look like this:
    Plural-Forms: nplurals=3; \
        plural=(n==1) ? 0 : (n>=2 && n<=4) ? 1 : 2;
Languages with this property include:
Slovak, Czech
The header entry would look like this:
    Plural-Forms: nplurals=3; \
        plural=n==1 ? 0 : \
               n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
Languages with this property include:
Polish
The header entry would look like this:
    Plural-Forms: nplurals=4; \
        plural=n%100==1 ? 0 : n%100==2 ? 1 : n%100==3 || n%100==4 ? 2 : 3;
Languages with this property include:
Slovenian
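Because the plural expressions use C syntax, they can be tried out directly as C functions. The sketch below transcribes the Polish and the English header entries from the table above; the function names plural_polish and plural_english are hypothetical, and gettext itself evaluates the expression from the catalog header rather than from compiled code.

```c
/* Plural index for the Polish header entry quoted above (nplurals=3). */
static unsigned int
plural_polish (unsigned long n)
{
  return n == 1 ? 0
         : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1
         : 2;
}

/* Plural index for the two-form rule used by English (nplurals=2). */
static unsigned int
plural_english (unsigned long n)
{
  return n != 1;
}
```

Checking plural_polish against the list reported for “plik” gives index 0 for 1, index 1 for 2 to 4 and 22 to 24, and index 2 for 5 to 21 and 25 to 31.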
You might now ask: ngettext handles only numbers n of type ‘unsigned long’. What about larger integer types? What about negative numbers? What about floating-point numbers?
About larger integer types, such as ‘uintmax_t’ or ‘unsigned long long’: they can be handled by reducing the value to a range that fits in an ‘unsigned long’. Simply casting the value to ‘unsigned long’ would not do the right thing, since it would treat ULONG_MAX + 1 like zero, ULONG_MAX + 2 like singular, and the like. Here you can exploit the fact that all mentioned plural form formulas eventually become periodic, with a period that is a divisor of 100 (or 1000 or 1000000). So, when you reduce a large value to another one in the range [1000000, 1999999] that ends in the same 6 decimal digits, you can assume that it will lead to the same plural form selection. This code does this:
    #include <inttypes.h>
    #include <limits.h>
    uintmax_t nbytes = ...;
    printf (ngettext ("The file has %"PRIuMAX" byte.",
                      "The file has %"PRIuMAX" bytes.",
                      (nbytes > ULONG_MAX
                       ? (nbytes % 1000000) + 1000000
                       : nbytes)),
            nbytes);
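The reduction can also be isolated in a helper so that its arithmetic is easy to check. This is a sketch under one loud assumption: the threshold is passed as a parameter (hypothetical, for illustration only) so that the reduction can be exercised even on platforms where uintmax_t and unsigned long have the same width; real code would compare against ULONG_MAX as shown above.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: reduce n to a value that fits in unsigned long
   while keeping its last six decimal digits. The limit is a parameter
   only so that the reduction can be tested on any platform; real code
   would pass ULONG_MAX. */
static unsigned long
plural_arg (uintmax_t n, uintmax_t limit)
{
  return (unsigned long) (n > limit ? n % 1000000 + 1000000 : n);
}
```

For example, with a limit of 4294967295 the value 5000000123 is reduced to 1000123, which ends in the same six decimal digits and lies in the range [1000000, 1999999].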
Negative and floating-point values usually represent physical entities for which singular and plural don't clearly apply. In such cases, there is no need to use ngettext; a simple gettext call with a form suitable for all values will do. For example:
    printf (gettext ("Time elapsed: %.3f seconds"), num_milliseconds * 0.001);
Even if num_milliseconds happens to be a multiple of 1000, the output
    Time elapsed: 1.000 seconds
is acceptable in English, and similarly for other languages.
At this point of the discussion we should talk about an advantage of the GNU gettext implementation. Some readers might have pointed out that an internationalized program might have poor performance if some string has to be translated in an inner loop. While this is unavoidable when the string varies from one run of the loop to the other, it is simply a waste of time when the string is always the same. Take the following example:
    {
      while (…)
        {
          puts (gettext ("Hello world"));
        }
    }
When the locale selection does not change between two runs, the resulting string is always the same. One way to use this is:
    {
      str = gettext ("Hello world");
      while (…)
        {
          puts (str);
        }
    }
But this solution is not usable in all situations (e.g. when the locale selection changes), nor does it lead to legible code.
For this reason, GNU gettext caches previous translation results. When the same translation is requested twice, with no new message catalogs being loaded in between, gettext will, the second time, find the result through a single cache lookup.
The following discussion is perhaps a little bit colored. As said above, we implemented GNU gettext following the Uniforum proposal, and this surely has its reasons. But it should show how we came to this decision.
First we take a look at the development process. When we write an application using NLS provided by gettext, we proceed as always. Only when we come to a string which might be seen by the users, and thus has to be translated, do we use gettext("…") instead of "…". At the beginning of each source file (or in a central header file) we define
    #define gettext(String) (String)
Even this definition can be avoided when the system supports the gettext function in its C library. When we compile this code, the result is the same as if no NLS code is used. When you take a look at the GNU gettext code you will see that we use _("…") instead of gettext("…"). This reduces the number of additional characters per translatable string to 3 (in words: three).
When a production version of the program is needed, we simply replace the definition
    #define _(String) (String)
by
    #include <libintl.h>
    #define _(String) gettext (String)
Additionally we run the program ‘xgettext’ on all source code files which contain translatable strings, and that's it: we have a running program which does not depend on translations being available, but which can use any that becomes available.
The same procedure can be done for the gettext_noop invocations (see Special cases). One usually defines gettext_noop as a no-op macro. So you should consider the following code for your project:
    #define gettext_noop(String) String
    #define N_(String) gettext_noop (String)
N_ is a short form similar to _. The ‘Makefile’ in the ‘po/’ directory of GNU gettext knows by default both of the mentioned short forms, so you are invited to follow this proposal for your own ease.
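How the two short forms work together is easiest to see on static data, where a function call is not allowed in an initializer. The following is a minimal sketch; menu_labels and menu_label are hypothetical names chosen for illustration, and the macro definitions simply repeat the ones proposed above.

```c
#include <libintl.h>

#define _(String) gettext (String)
#define gettext_noop(String) String
#define N_(String) gettext_noop (String)

/* N_ only marks the strings for extraction; initializers of static
   data cannot contain function calls. */
static const char *const menu_labels[] = {
  N_("Open"),
  N_("New"),
  N_("Connect")
};

/* The actual lookup happens at the point of use. */
static const char *
menu_label (unsigned int i)
{
  return _(menu_labels[i]);
}
```

With no catalog installed, menu_label (0) simply returns "Open"; once a translation is available, the same call returns the translated label.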
Now to catgets. The main problem is the work for the programmer. Every time he comes to a translatable string he has to define a number (or a symbolic constant) which also has to be defined in the message catalog file. He also has to take care of duplicate entries, duplicate message IDs etc. If he wants to have the same quality in the message catalog as the GNU gettext program provides, he also has to put the descriptive comments for the strings and the location in all source code files in the message catalog. This is nearly a Mission: Impossible.
But there are also some points people might call advantages speaking for catgets. If you have a single word in a string and this string is used in different contexts, it is likely that in one or the other language the word has different translations. Example:
    printf ("%s: %d", gettext ("number"), number_of_errors);
    printf ("you should see %d %s", number_count,
            number_count == 1 ? gettext ("number") : gettext ("numbers"));
Here we have to translate the string "number" twice. Even if you do not speak a language besides English, it might be possible to recognize that the two words have a different meaning. In German the first appearance has to be translated to "Anzahl" and the second to "Zahl".
Now you can say that this example is really esoteric. And you are right! This is exactly how we felt about this problem, and we decided that it does not weigh that much. The solution for the above problem could be very easy:
    printf ("%s %d", gettext ("number:"), number_of_errors);
    printf (number_count == 1 ? gettext ("you should see %d number")
                              : gettext ("you should see %d numbers"),
            number_count);
We believe that we can solve all conflicts with this method. If it is difficult, one can also consider changing one of the conflicting strings a little bit. But it is not impossible to overcome.
catgets allows the same original entry to have different translations, but gettext has another, scalable approach for solving ambiguities of this kind: see Ambiguities.
Starting with version 0.9.4 the library libintl.h should be self-contained. I.e., you can use it in your own programs without providing additional functions. The ‘Makefile’ will put the header and the library in directories selected using the $(prefix).
Being a gettext grok
NOTE: This documentation section is outdated and needs to be revised.
To fully exploit the functionality of the GNU gettext library it is surely helpful to read the source code. But for those who don't want to spend that much time reading the (sometimes complicated) code, here is a list of comments:
For interactive programs it might be useful to offer a selection of the used language at runtime. To understand how to do this, one needs to know how the used language is determined while executing the gettext function. The method which is presented here only works correctly with the GNU implementation of the gettext functions.
In the function dcgettext, at every call the current setting of the highest priority environment variable is determined and used. Highest priority means here the following list with decreasing priority:
    1. LANGUAGE
    2. LC_ALL
    3. LC_xxx, according to the selected locale category
    4. LANG
Afterwards the path is constructed using the found value and the translation file is loaded if available.
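The priority rule can be sketched as a plain environment lookup. This is only an illustration under stated assumptions: the order used is LANGUAGE, then LC_ALL, then the category variable (LC_MESSAGES here), then LANG; message_locale_from_env is a hypothetical helper, and the real implementation applies further rules (for instance, LANGUAGE may hold a colon-separated list) that the sketch does not reproduce.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the value of the highest-priority
   variable that is set and non-empty, or NULL if none is. */
static const char *
message_locale_from_env (void)
{
  static const char *const names[] =
    { "LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG" };

  for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
    {
      const char *value = getenv (names[i]);
      if (value != NULL && value[0] != '\0')
        return value;
    }
  return NULL;
}
```

Because the environment is consulted on every call, a change to one of these variables is picked up by the next lookup, which is the behavior the following paragraphs rely on.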
What happens now when the value for, say, LANGUAGE changes? According to the process explained above, the new value of this variable is found as soon as the dcgettext function is called. But this also means the (perhaps) different message catalog file is loaded. In other words: the used language is changed.
But there is one little catch. The code for gcc-2.7.0 and up provides some optimization. This optimization normally prevents the calling of the dcgettext function as long as no new catalog is loaded. But if dcgettext is not called, the program also cannot notice that the LANGUAGE variable has been changed (see Optimized gettext). A solution for this is very easy. Include the following code in the language switching function.
    /* Change language. */
    setenv ("LANGUAGE", "fr", 1);
    /* Make change known. */
    {
      extern int _nl_msg_cat_cntr;
      ++_nl_msg_cat_cntr;
    }
The variable _nl_msg_cat_cntr is defined in ‘loadmsgcat.c’. You don't need to know what this is for. But it can be used to detect whether a gettext implementation is GNU gettext and not a non-GNU system's native gettext implementation.
NOTE: This documentation section is outdated and needs to be revised.
11.6.1 Temporary - Two Possible Implementations
11.6.2 Temporary - About catgets
11.6.3 Temporary - Why a single implementation
11.6.4 Temporary - Notes
There are two competing methods for language independent messages: the X/Open catgets method, and the Uniforum gettext method. The catgets method indexes messages by integers; the gettext method indexes them by their original English strings. The catgets method has been around longer and is supported by more vendors. The gettext method is supported by Sun, and it has been heard that the COSE multi-vendor initiative is supporting it. Neither method is a POSIX standard; the POSIX.1 committee had a lot of disagreement in this area.
Neither one is in the POSIX standard. There was much disagreement in the POSIX.1 committee about using the gettext routines vs. catgets (XPG). In the end the committee couldn't agree on anything, so no messaging system was included as part of the standard. I believe the informative annex of the standard includes the XPG3 messaging interfaces, “…as an example of a messaging system that has been implemented…”
They were very careful not to say anywhere that you should use one set of interfaces over the other. For more on this topic please see the Programming for Internationalization FAQ.
Temporary - About catgets
There have been a few discussions of late on the use of catgets as a base. I think it important to present both sides of the argument and hence am opting to play devil's advocate for a little bit.
I'll not deny the fact that catgets could have been designed a lot better. It currently has quite a number of limitations and these have already been pointed out.
However there is a great deal to be said for consistency and standardization. A common recurring problem when writing Unix software is the myriad portability problems across Unix platforms. It seems as if every Unix vendor had a look at the operating system and found parts they could improve upon. Undoubtedly, these modifications are probably innovative and solve real problems. However, software developers have a hard time keeping up with all these changes across so many platforms.
And this has prompted the Unix vendors to begin to standardize their systems. Hence the impetus for Spec1170. Every major Unix vendor has committed to supporting this standard, and every Unix software developer awaits with glee the day they can write software to this standard and simply recompile (without having to use autoconf) across different platforms.
As I understand it, Spec1170 is roughly based upon version 4 of the X/Open Portability Guidelines (XPG4). Because catgets and friends are defined in XPG4, I'm led to believe that catgets is a part of Spec1170 and hence will become a standardized component of all Unix systems.
Now it seems kind of wasteful to me to have two different systems installed for accessing message catalogs. If we do want to remedy the deficiencies of catgets, why don't we try to expand catgets (in a compatible manner) rather than implement an entirely new system? Otherwise, we'll end up with two message catalog access systems installed with an operating system: one set of routines for packages using GNU gettext for their internationalization, and another set of routines (catgets) for all other software. Bloated?
Supposing another catalog access system is implemented. Which do we recommend? At least for Linux, we need to attract as many software developers as possible. Hence we need to make it as easy for them to port their software as possible. Which means supporting catgets. We will be implementing the libintl code within our libc, but does this mean we also have to incorporate another message catalog access scheme within our libc as well? And what about people who are going to be using the libintl + non-catgets routines? When they port their software to other platforms, they're now going to have to include the front-end (libintl) code plus the back-end code (the non-catgets access routines) with their software instead of just including the libintl code with their software.
Message catalog support is however only the tip of the iceberg. What about the data for the other locale categories? They also have a number of deficiencies. Are we going to abandon them as well and develop another duplicate set of routines (should libintl expand beyond message catalog support)?
Like many parts of Unix that can be improved upon, we're stuck with balancing compatibility with the past with useful improvements and innovations for the future.
X/Open agreed very late on the standard form, so that many implementations differ from the final form. Both of my systems (old Linux catgets and Ultrix-4) have a strange variation.
OK. After incorporating the last changes I have to spend some time on making the GNU/Linux libc gettext functions. So in future Solaris will not be the only system having gettext.
12.1 Introduction 0
12.2 Introduction 1
12.3 Discussions
12.4 Organization
12.5 Information Flow
12.6 Prioritizing messages: How to determine which messages to translate first
NOTE: This documentation section is outdated and needs to be revised.
Free software is going international! The Translation Project is a way to get maintainers, translators and users all together, so free software will gradually become able to speak many native languages.
The GNU gettext tool set contains everything maintainers need for internationalizing their packages for messages. It also contains quite useful tools for helping translators localize messages to their native language, once a package has already been internationalized.
To achieve the Translation Project, we need many interested people who like their own language and write it well, and who are also able to synergize with other translators speaking the same language. If you'd like to volunteer to work at translating messages, please send mail to your translating team.
Each team has its own mailing list, courtesy of Linux International. You may reach your translating team at the address ‘ll@li.org’, replacing ll by the two-letter ISO 639 code for your language. Language codes are not the same as country codes given in ISO 3166. The following translating teams exist:
Chinese zh, Czech cs, Danish da, Dutch nl, Esperanto eo, Finnish fi, French fr, Irish ga, German de, Greek el, Italian it, Japanese ja, Indonesian in, Norwegian no, Polish pl, Portuguese pt, Russian ru, Spanish es, Swedish sv and Turkish tr.
For example, you may reach the Chinese translating team by writing to ‘zh@li.org’. When you become a member of the translating team for your own language, you may subscribe to its list. For example, Swedish people can send a message to ‘sv-request@li.org’, having this message body:
    subscribe
Keep in mind that team members should be interested in working at translations, or at solving translational difficulties, rather than merely lurking around. If your team does not exist yet and you want to start one, please write to ‘coordinator@translationproject.org’; you will then reach the coordinator for all translator teams.
A handful of GNU packages have already been adapted and provided with message translations for several languages. Translation teams have begun to organize, using these packages as a starting point. But there are many more packages and many languages for which we have no volunteer translators. If you would like to volunteer to work at translating messages, please send mail to ‘coordinator@translationproject.org’ indicating what language(s) you can work on.
NOTE: This documentation section is outdated and needs to be revised.
This is now official, GNU is going international! Here is the announcement submitted for the January 1995 GNU Bulletin:
A handful of GNU packages have already been adapted and provided with message translations for several languages. Translation teams have begun to organize, using these packages as a starting point. But there are many more packages and many languages for which we have no volunteer translators. If you'd like to volunteer to work at translating messages, please send mail to ‘coordinator@translationproject.org’ indicating what language(s) you can work on.
This document should answer many questions for those who are curious about the process or would like to contribute. Please at least skim over it, hoping to cut down a little of the high volume of e-mail generated by this collective effort towards internationalization of free software.
Most free programming which is widely shared is done in English, and currently, English is used as the main communicating language between national communities collaborating on free software. This very document is written in English. This will not change in the foreseeable future.
However, there is a strong appetite from national communities for having more software able to write using national language and habits, and there is an on-going effort to modify free software in such a way that it becomes able to do so. The experiments driven so far raised an enthusiastic response from pretesters, so we believe that internationalization of free software is destined to succeed.
To suggest clarifications, additions or corrections to this document, please e-mail ‘coordinator@translationproject.org’.
NOTE: This documentation section is outdated and needs to be revised.
Facing this internationalization effort, a few users expressed their concerns. Some of these doubts are presented and discussed here.
Some languages are not spoken by a very large number of people, so people speaking them sometimes consider that there may not be all that much demand for such versions of free software packages. Moreover, many people who are into computers, in some countries, generally seem to prefer English versions of their software.
On the other hand, people might enjoy their own language a lot, and be very motivated to give themselves the pleasure of having their beloved free software speak their mother tongue. They do themselves a personal favor, and do not pay that much attention to the number of people benefiting from their work.
Other users are shy to push forward their own language, seeing in this some kind of misplaced propaganda. Someone thought there must be some users of the language over the networks pestering other people with it.
But any spoken language is worth localization, because there are people behind the language for whom the language is important and dear to their hearts.
The biggest problem is to find the right translations so that everybody can understand the messages. Translations are usually a little odd. Some people get used to English, to the extent they may find translations into their own language “rather pushy, obnoxious and sometimes even hilarious.” As a French speaking man, I have the experience of those instruction manuals for goods, so poorly translated into French in Korea or Taiwan…
The fact is that we sometimes have to create a kind of national computer culture, and this is not easy without the collaboration of many people liking their mother tongue. This is why translations are better achieved by people knowing and loving their own language, and ready to work together at improving the results they obtain.
Some people wonder if using GNU gettext necessarily brings their package under the protective wing of the GNU General Public License or the GNU Library General Public License, when they do not want to make their program free, or want other kinds of freedom. The simplest answer is “normally not”.
The gettext-runtime part of GNU gettext, i.e. the contents of libintl, is covered by the GNU Library General Public License. The gettext-tools part of GNU gettext, i.e. the rest of the GNU gettext package, is covered by the GNU General Public License.
The mere marking of localizable strings in a package, or conditional inclusion of a few lines for initialization, is not really including GPL'ed or LGPL'ed code. However, since the localization routines in libintl are under the LGPL, the LGPL needs to be considered. It gives the right to distribute the complete unmodified source of libintl even with non-free programs. It also gives the right to use libintl as a shared library, even for non-free programs. But it gives the right to use libintl as a static library or to incorporate libintl into another library only to free software.
NOTE: This documentation section is outdated and needs to be revised.
On a larger scale, the true solution would be to organize some kind of fairly precise setup in which volunteers could participate. I gave some thought to this idea lately, and realize there will be some touchy points. I thought of writing to Richard Stallman to launch such a project, but feel it might be good to shake out the ideas between ourselves first. Most probably Linux International has some experience in the field already, or would like to orchestrate the volunteer work, maybe. Food for thought, in any case!
I guess we have to set up something early, somehow, that will help many possible contributors of the same language to interlock and avoid work duplication, and further be put in contact for solving together problems particular to their tongue (in most languages, there are many difficulties peculiar to translating technical English). My Swedish contributor acknowledged these difficulties, and I'm well aware of them for French.
This is surely not a technical issue, but we should arrange things so that the effort of locale contributors is maximally useful, despite the national team layer interface between contributors and maintainers.
The Translation Project needs some setup for coordinating language coordinators. Localizing evolving programs will surely become a permanent and continuous activity in the free software community, once well started. The setup should be minimally completed and tested before GNU gettext becomes an official reality. The e-mail address ‘coordinator@translationproject.org’ has been set up for receiving offers from volunteers and general e-mail on these topics. This address reaches the Translation Project coordinator.
12.4.1 Central Coordination
12.4.2 National Teams
12.4.3 Mailing Lists
I also think GNU will need, sooner than it thinks, someone to set up a way to organize and coordinate these groups. Some kind of group of groups. My opinion is that it would be good for GNU to delegate this task to a small group of collaborating volunteers, shortly. Perhaps a list of these national committees can be published in ‘gnu.announce’.
My role as coordinator would simply be to refer to Ulrich any German speaking volunteer interested in localization of free software packages, and maybe to help national groups organize initially, while maintaining national registries until national groups are ready to take over. In fact, the coordinator should make it easy for volunteers to get in contact with one another for creating national teams, which should then select one coordinator per language, or country (regionalized language). If well done, the coordination should be useful without being an overwhelming task, for the time it takes to put delegations in place.
I suggest we look for volunteer coordinators/editors for individual languages. These people will scan contributions of translation files for various programs, for their own languages, and will ensure high and uniform standards of diction.
From my recent experience with other people, those who provide localizations are very enthusiastic about the process, and are more interested in the localization process than in the program they localize, and want to do many programs, not just one. This seems to confirm that having a coordinator/editor for each language is a good idea.
We need to choose someone who is good at writing clear and concise prose in the language in question. That is hard: we can't check it ourselves. So we need to ask a few people to judge each other's writing and select the one who is best.
I announced my prerelease to a few dozen people, and you would not believe all the discussions it generated already. I shudder to think what will happen when this is launched, for true, officially, world wide. Who am I to arbitrate between two Czechoslovak users contradicting each other, for example?
I assume that your German is not much better than my French so that I would not be able to judge about these formulations. What I would suggest is that for each language there is a group of people who maintain the PO files and judge about changes. I suspect there will be cultural differences between how such groups of people will behave. Some will have relaxed ways, reach consensus easily, and have anyone of the group relate to the maintainers, while others will fight to death, organize heavy administrations up to national standards, and use strict channels.
The German team is setting a good example. Right now, they are maybe half a dozen people revising translations of each other and discussing the linguistic issues. I do not even have all the names. Ulrich Drepper is taking care of coordinating the German team. He subscribed to all my pretest lists, so I do not even have to warn him specifically of incoming releases.
I'm sure that it is a good idea to get teams for each language working on translations. That will make the translations better and more consistent.
12.4.2.1 Sub-Cultures
12.4.2.2 Organizational Ideas
Taking French for example, there are a few sub-cultures around computers which developed diverging vocabularies. Picking volunteers here and there without addressing this problem in an organized way, early in the project, might produce a distasteful mix of internationalized programs, and possibly trigger endless quarrels among those who really care.
Keeping some kind of unity in the way French localization of internationalized programs is achieved is a difficult (and delicate) job. Knowing the latin character of French people (:-), if we take this the wrong way, we could end up nowhere, or waste a lot of energy. Maybe we should begin to address this problem seriously before GNU gettext becomes officially published. And I suspect that this means soon!
I expect the next big changes after the official release. Please note that I use the German translation of the short GPL message. We need to set a few good examples before the localization goes out for true in the free software community. Here are a few points to discuss:
If we get any inquiries about GNU gettext, send them on to:
    ‘coordinator@translationproject.org’
The ‘*-pretest’ lists are quite useful to me, maybe the idea could be generalized to many GNU, and non-GNU packages. But each maintainer his/her way!
François, we have a mechanism in place here at ‘gnu.ai.mit.edu’ to track teams, support mailing lists for them and log members. We have a slight preference that you use it. If this is OK with you, I can get you clued in.
Things are changing! A few years ago, when Daniel Fekete and I asked for a mailing list for GNU localization, nested at the FSF, we were politely invited to organize it anywhere else, and so did we. For communicating with my pretesters, I later made a handful of mailing lists located at iro.umontreal.ca and administrated by majordomo. These lists have been very dependable so far…
I suspect that the German team will organize itself a mailing list located in Germany, and so forth for other countries. But before they organize for true, it could surely be useful to offer mailing lists located at the FSF to each national team. So yes, please explain to me how I should proceed to create and handle them.
We should create temporary mailing lists, one per country, to help people organize. Temporary, because once regrouped and structured, it would be fair that the volunteers from each country bring their list back there and manage it as they want. My feeling is that, in the long run, each team should run its own list, from within their country. There also should be some central list to which all teams could subscribe as they see fit, as long as each team is represented in it.
NOTE: This documentation section is outdated and needs to be revised.
There will surely be some discussion about these messages after the packages are finally released. If people now send you some proposals for better messages, how do you proceed? Jim, please note that right now, as I put forward nearly a dozen localizable programs, I receive both the translations and the coordination concerns about them.
If I put one of my things to pretest, Ulrich receives the announcement and passes it on to the German team, who make last minute revisions. Then he submits the translation files to me as the maintainer. For free packages I do not maintain, I would not even hear about it. This scheme could be made to work for the whole Translation Project, I think. For security reasons, maybe Ulrich (national coordinators, in fact) should update the central registry kept at the Translation Project (Jim, me, or Len's recruits) once in a while.
In December/January, I was aggressively ready to internationalize all of GNU, giving myself the duty of one small GNU package per week or so, taking many weeks or months for bigger packages. But it does not work this way. I first did all the things I'm responsible for. I've nothing against some missionary work on other maintainers, but I'm also losing a lot of energy over it: the same debates over again.
And when the first localized packages are released we'll get a lot of responses about ugly translations :-). Surely, and we need to have beforehand a fairly good idea about how to handle the information flow between the national teams and the package maintainers.
Please start saving somewhere a quick history of each PO file. I know for sure that the file format will change, allowing for comments. It would be nice that each file has a kind of log, and references for those who want to submit comments or gripes, or otherwise contribute. I sent a proposal for a fast and flexible format, but it is not receiving acceptance yet by the GNU deciders. I'll tell you when I have more information about this.
A translator sometimes has only a limited amount of time per week to spend on a package, and some packages have quite large message catalogs (over 1000 messages). Therefore she wishes to translate the messages first that are the most visible to the user, or that occur most frequently. This section describes how to determine these "most urgent" messages. It also applies to determine the "next most urgent" messages after the message catalog has already been partially translated.
In a first step, she uses the programs like a user would do. While she does this, the GNU gettext library logs into a file the not yet translated messages for which a translation was requested from the program.
In a second step, she uses the PO mode to translate precisely this set of messages.
Here are more details. The GNU libintl library (but not the corresponding functions in GNU libc) supports an environment variable GETTEXT_LOG_UNTRANSLATED, whose value names a log file. The GNU libintl library will log into this file the messages for which gettext() and related functions couldn't find the translation. If the file doesn't exist, it will be created as needed. On systems with GNU libc a shared library ‘preloadable_libintl.so’ is provided that can be used with the ELF ‘LD_PRELOAD’ mechanism.
So, in the first step, the translator uses these commands on systems with GNU libc:
    $ LD_PRELOAD=/usr/local/lib/preloadable_libintl.so
    $ export LD_PRELOAD
    $ GETTEXT_LOG_UNTRANSLATED=$HOME/gettextlogused
    $ export GETTEXT_LOG_UNTRANSLATED
and these commands on other systems:
    $ GETTEXT_LOG_UNTRANSLATED=$HOME/gettextlogused
    $ export GETTEXT_LOG_UNTRANSLATED
Then she uses and peruses the programs. (It is a good and recommended practice to use the programs for which you provide translations: it gives you the needed context.) When done, she removes the environment variables:
    $ unset LD_PRELOAD
    $ unset GETTEXT_LOG_UNTRANSLATED
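As a variation on the export/unset sequence above, the variable can be scoped to a single command, so that nothing needs to be unset afterwards. The following is only a minimal sketch, not something shipped with GNU gettext: the function name `run_logged` is hypothetical, the stand-in command merely prints the variable instead of being a real internationalized program, and on systems with GNU libc ‘LD_PRELOAD’ would have to be passed in the same way.

```shell
#!/bin/sh
# Hypothetical helper: run one command with GETTEXT_LOG_UNTRANSLATED set
# only in that command's environment, leaving the calling shell untouched.
run_logged() {
    logfile=$1
    shift
    env GETTEXT_LOG_UNTRANSLATED="$logfile" "$@"
}

# Log file location, following the example above (falls back to /tmp
# when HOME is not set).
log=${HOME:-/tmp}/gettextlogused

# Demonstration with a stand-in command that just prints the variable.
seen=$(run_logged "$log" sh -c 'printf "%s" "$GETTEXT_LOG_UNTRANSLATED"')
echo "child saw: $seen"
```

With a real program, the call would look like `run_logged "$log" ls --help`, and the translator would repeat it for each program she exercises.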
The second step starts with removing duplicates:
    $ msguniq $HOME/gettextlogused > missing.po
The result is a PO file, but needs some preprocessing before a PO file editor can be used with it. First, it is a multi-domain PO file, containing messages from many translation domains. Second, it lacks all translator comments and source references. Here is how to get a list of the affected translation domains:
    $ sed -n -e 's,^domain "\(.*\)"$,\1,p' < missing.po | sort | uniq
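To see what this pipeline produces, it can be tried on a small sample. The sketch below assumes a hand-written stand-in for ‘missing.po’ (the message strings are invented for illustration and are not real `msguniq` output) and runs the same `sed` expression on it.

```shell
#!/bin/sh
# Build a small stand-in for missing.po with entries from two domains.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
domain "coreutils"

msgid "cannot open %s"
msgstr ""

domain "grep"

msgid "invalid argument"
msgstr ""

domain "coreutils"

msgid "write error"
msgstr ""
EOF

# Same pipeline as above: print the name from each `domain "..."` line,
# then sort the names and drop duplicates.
domains=$(sed -n -e 's,^domain "\(.*\)"$,\1,p' < "$tmp" | sort | uniq)
echo "$domains"
rm -f "$tmp"
```

Each domain name is printed once, one per line, however many times its `domain` directive occurs in the file.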
Then the translator can handle the domains one by one. For simplicity, let's use environment variables to denote the language, domain and source package.
    $ lang=nl             # your language
    $ domain=coreutils    # the name of the domain to be handled
    $ package=/usr/src/gnu/coreutils-4.5.4   # the package where it comes from
She takes the latest copy of ‘$lang.po’ from the Translation Project, or from the package (in most cases, ‘$package/po/$lang.po’), or creates a fresh one if she's the first translator (see the section on creating a new PO file). She then uses the following commands to mark the not urgent messages as "obsolete". (This doesn't mean that these messages, translated and untranslated ones, will go away. It simply means that the PO file editor will ignore them in the following editing session.)
    $ msggrep --domain=$domain missing.po | grep -v '^domain' \
      > $domain-missing.po
    $ msgattrib --set-obsolete --ignore-file $domain-missing.po $domain.$lang.po \
      > $domain.$lang-urgent.po
Then she translates ‘$domain.$lang-urgent.po’ by use of a PO file editor (see the section on editing PO files). (FIXME: I don't know whether KBabel and gtranslator also preserve obsolete messages, as they should.)
Finally she restores the not urgent messages (with their earlier translations, for those which were already translated) through this command:
    $ msgmerge --no-fuzzy-matching $domain.$lang-urgent.po $package/po/$domain.pot \
      > $domain.$lang.po
Then she can submit ‘$domain.$lang.po’ and proceed to the next domain.
The maintainer of a package has many responsibilities. One of them is ensuring that the package will install easily on many platforms, and that the magic we described earlier (see the chapter on the user's view) will work for installers and end users.
Of course, there are many possible ways by which GNU gettext might be integrated in a distribution, and this chapter does not cover them in all generality. Instead, it details one possible approach which is especially adequate for many free software distributions following GNU standards, or even better, Gnits standards, because GNU gettext is purposely for helping the internationalization of the whole GNU project, and as many other good free packages as possible. So, the maintainer's view presented here presumes that the package already has a ‘configure.ac’ file and uses GNU Autoconf.
Nevertheless, GNU gettext may surely be useful for free packages not following GNU standards and conventions, but the maintainers of such packages might have to show imagination and initiative in organizing their distributions so gettext works for them in all situations. There are surely many, out there.
Even if gettext methods are now stabilizing, slight adjustments might be needed between successive gettext versions, so you should ideally revise this chapter in subsequent releases, looking for changes.
Some free software packages are distributed as tar files which unpack in a single directory; these are said to be flat distributions. Other free software packages have a one level hierarchy of subdirectories, using for example a subdirectory named ‘doc/’ for the Texinfo manual and man pages, another called ‘lib/’ for holding functions meant to replace or complement C libraries, and a subdirectory ‘src/’ for holding the proper sources for the package. These other distributions are said to be non-flat.
We cannot say much about flat distributions. A flat directory structure has the disadvantage of increasing the difficulty of updating to a new version of GNU gettext. Also, if you have many PO files, this could somewhat pollute your single directory. Also, GNU gettext's libintl sources consist of C sources, shell scripts, sed scripts and complicated Makefile rules, which don't fit well into an existing flat structure. For these reasons, we recommend using a non-flat approach in this case as well.
Maybe because GNU gettext itself has a non-flat structure, we have more experience with this approach, and this is what will be described in the remainder of this chapter. Some maintainers might use this as an opportunity to unflatten their package structure.
There are some tasks which are required for using GNU gettext in one of your packages. These tasks have some kind of generality that escapes the point by point descriptions used in the remainder of this chapter. So, we describe them here.
Before using gettextize you should install some other packages first. Ensure that recent versions of GNU m4, GNU Autoconf and GNU gettext are already installed at your site, and if not, proceed to do this first. If you get to install these things, beware that GNU m4 must be fully installed before GNU Autoconf is even configured.
To further ease the task of a package maintainer the automake package was designed and implemented. GNU gettext now uses this tool and the ‘Makefile’s in the ‘intl/’ and ‘po/’ therefore know about all the goals necessary for using automake and ‘libintl’ in one project.
Those four packages are only needed by you, as a maintainer; the installers of your own package and end users do not really need any of GNU m4, GNU Autoconf, GNU gettext, or GNU automake for successfully installing and running your package, with messages properly translated. But this is not completely true if you provide internationalized shell scripts within your own package: GNU gettext shall then be installed at the user site if the end users want to see the translation of shell script messages.
It is worth adding here a few words about how the maintainer should ideally behave with PO file submissions. As a maintainer, your role is to authenticate the origin of the submission as being the representative of the appropriate translating teams of the Translation Project (forward the submission to ‘coordinator@translationproject.org’ in case of doubt), to ensure that the PO file format is not severely broken and does not prevent successful installation, and for the rest, to merely put these PO files in ‘po/’ for distribution.
As a maintainer, you do not have to take on your shoulders the responsibility of checking if the translations are adequate or complete, and should avoid diving into linguistic matters. Translation teams drive themselves and are fully responsible for their linguistic choices for the Translation Project. Keep in mind that translator teams are not driven by maintainers. You can help by carefully redirecting all communications and reports from users about linguistic matters to the appropriate translation team, or by explaining to users how to reach or join their team. The simplest might be to send them the ‘ABOUT-NLS’ file.
Maintainers should never ever apply PO file bug reports themselves, short-cutting translation teams. If some translator has difficulty getting some of her points through her team, it should not be an option for her to directly negotiate translations with maintainers. Teams ought to settle their problems themselves, if any. If you, as a maintainer, ever think there is a real problem with a team, please never try to solve a team's problem on your own.
gettextize
The gettextize program is an interactive tool that helps the maintainer of a package internationalized through GNU gettext. It is used for two purposes:
- As a wizard, when a package is modified to use GNU gettext for the first time.
- As a migration tool, for upgrading the GNU gettext support of a package from an older version.
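This program performs the following tasks:
- It copies into the package some files that are consistently and identically needed in every package internationalized through GNU gettext.
- It performs as many of the file adjustments described later in this chapter as can be performed automatically.
- It converts obsolete files and idioms used for previous GNU gettext versions to the forms recommended for the current version of GNU gettext.
- It prints a summary of the tasks that ought to be done manually and could not be done automatically by gettextize.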
It can be invoked as follows:
    gettextize [ option… ] [ directory ]
and accepts the following options:
‘-f’, ‘--force’
    Force replacement of files which already exist.
‘--intl’
    Install the libintl sources in a subdirectory named ‘intl/’. This libintl will be used to provide internationalization on systems that don't have GNU libintl installed. If this option is omitted, the call to AM_GNU_GETTEXT in ‘configure.ac’ should read: ‘AM_GNU_GETTEXT([external])’, and internationalization will not be enabled on systems lacking GNU gettext.
‘--po-dir=dir’
    Specify a directory containing PO files. Such a directory contains the translations into various languages of a particular POT file. This option can be specified multiple times, once for each translation domain. If it is not specified, the directory named ‘po/’ is updated.
‘--no-changelog’
    Don't update or create ChangeLog files. By default, gettextize logs all changes (file additions, modifications and removals) in a file called ‘ChangeLog’ in each affected directory.
‘--symlink’
    Make symbolic links instead of copying the needed files. This can be useful to save a few kilobytes of disk space, but it requires extra effort to create self-contained tarballs, it may disturb some mechanism the maintainer applies to the sources, and it is likely to introduce bugs when a newer version of gettext is installed on the system.
‘-n’, ‘--dry-run’
    Print modifications but don't perform them. All actions that gettextize would normally execute are inhibited and instead only listed on standard output.
‘--help’
    Display this help and exit.
‘--version’
    Output version information and exit.
If directory is given, it is the top-level directory of a package to prepare for using GNU gettext. If it is not given, the current directory is assumed to be the top-level directory of such a package.
The gettextize program provides the following files. However, no existing file will be replaced unless the option --force (-f) is specified.
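1. The ‘ABOUT-NLS’ file is copied into the top-level directory of your package. This file gives the main indications about how to install and use the Native Language Support features of your program. You may choose to use a more recent copy of ‘ABOUT-NLS’ than the one provided through gettextize, if you have one available. You may also fetch a more recent copy of ‘ABOUT-NLS’ from Translation Project sites and from most GNU archive sites.
2. A ‘po/’ directory is created for eventually holding all translation files, initially containing only the file ‘po/Makefile.in.in’ from the GNU gettext distribution (beware the double ‘.in’ in the file name) and a few auxiliary files. If the ‘po/’ directory already exists, it is preserved along with the files it contains, and only ‘Makefile.in.in’ and the auxiliary files are overwritten.
   If ‘--po-dir’ has been specified, this holds for every directory specified through ‘--po-dir’ instead of ‘po/’.
3. Only if ‘--intl’ has been specified: an ‘intl/’ directory is created and filled with most of the files originally in the ‘intl/’ directory of the GNU gettext distribution. If the option --force (-f) is also given, the ‘intl/’ directory is emptied first.
4. The file ‘config.rpath’ is copied into the directory containing configuration support files. It is needed by the AM_GNU_GETTEXT autoconf macro.
5. Only if the project is using GNU automake: a set of autoconf macro files is copied into the package's autoconf macro repository, usually in a directory called ‘m4/’.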
If your site supports symbolic links, gettextize will not actually copy the files into your package, but establish symbolic links instead. This avoids duplication and reduces the disk space needed across all packages. Merely using the ‘-h’ option while creating the tar archive of your distribution will replace each link by an actual copy in the distribution archive. So, to insist, you really should use the ‘-h’ option with tar within the dist goal of your main ‘Makefile.in’.
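The effect of the ‘-h’ option of tar can be checked on a throwaway directory. The sketch below is only an illustration under stated assumptions: the directory layout and the file content are invented, and a symbolic link is created by hand to stand in for what gettextize would leave in a package.

```shell
#!/bin/sh
# Work in a throwaway directory.
work=$(mktemp -d)
mkdir "$work/pkg" "$work/out"
echo "shared file" > "$work/ABOUT-NLS"

# Stand-in for a link left in the package instead of a real copy.
ln -s "$work/ABOUT-NLS" "$work/pkg/ABOUT-NLS"

# -h makes tar archive the file a link points to instead of the link itself.
(cd "$work" && tar -chf dist.tar pkg)

# After unpacking, the entry is a regular file, not a symbolic link.
(cd "$work/out" && tar -xf ../dist.tar)
if [ -f "$work/out/pkg/ABOUT-NLS" ] && [ ! -L "$work/out/pkg/ABOUT-NLS" ]; then
    result="regular file"
else
    result="symbolic link"
fi
echo "$result"
rm -rf "$work"
```

Without ‘-h’, the archive would contain the link itself, which dangles on any machine where the link target does not exist.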
Furthermore, gettextize will update all ‘Makefile.am’ files in each affected directory, as well as the top-level ‘configure.ac’ or ‘configure.in’ file.
It is interesting to understand that most new files created to support GNU gettext facilities in a package go into the ‘intl/’, ‘po/’ and ‘m4/’ subdirectories. One distinction between ‘intl/’ and the two other directories is that ‘intl/’ is meant to be completely identical in all packages using GNU gettext, while the other directories will mostly contain package-dependent files.
The gettextize program makes backup files for all files it replaces or changes, and also writes ChangeLog entries about these changes. This way, a careful maintainer can check after running gettextize whether its changes are acceptable to him, and possibly adjust them. An exception to this rule is the ‘intl/’ directory, which is added, replaced or removed as a whole.
It is important to understand that gettextize cannot do the entire job of adapting a package to use GNU gettext. The amount of remaining work depends on whether the package uses GNU automake or not. But in any case, the maintainer should still read the section on adjusting files after invoking gettextize.
In particular, if after using ‘gettextize’ you get an error ‘AC_COMPILE_IFELSE was called before AC_GNU_SOURCE’ or ‘AC_RUN_IFELSE was called before AC_GNU_SOURCE’, you can fix it by modifying ‘configure.ac’, as described in the section on ‘configure.ac’ at top level.
It is also important to understand that gettextize is not part of the GNU build system, in the sense that it should not be invoked automatically, and should not be invoked by someone who does not assume the responsibilities of a package maintainer. For the latter purpose, a separate tool is provided; see the section on invoking autopoint.
[ < ] | [ > ] | [ << ] | [Plus haut] | [ >> ] | [Top] | [Table des matières] | [Index] | [ ? ] |
Besides files which are automatically added through gettextize, there are many files needing revision to interact properly with GNU gettext. If you are closely following GNU standards for Makefile engineering and auto-configuration, the adaptations should be easier to achieve. Here is a point by point description of the changes needed in each.
So, here comes a list of files, each one followed by a description of all alterations it needs. Many examples are taken from the GNU gettext 0.17 distribution itself, or from the GNU hello distribution (http://www.franken.de/users/gnu/ke/hello or http://www.gnu.franken.de/ke/hello/). You may indeed refer to the source code of the GNU gettext and GNU hello packages, as they are intended to be good examples for using GNU gettext functionality.
The ‘po/’ directory should receive a file named ‘POTFILES.in’. This file tells which files, among all program sources, have marked strings needing translation. Here is an example of such a file:
    # List of source files containing translatable strings.
    # Copyright (C) 1995 Free Software Foundation, Inc.

    # Common library files
    lib/error.c
    lib/getopt.c
    lib/xmalloc.c

    # Package source files
    src/gettext.c
    src/msgfmt.c
    src/xgettext.c

Hash-marked comments and white lines are ignored. All other lines list those source files containing strings marked for translation (see Mark Keywords), in a notation relative to the top level of your whole distribution, rather than the location of the ‘POTFILES.in’ file itself.
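The filtering rule just described (comments and white lines are ignored, every other line names a source file) can be sketched in C as follows. This is only an illustration: the function name is_potfiles_entry is hypothetical, and treating a line as a comment only when ‘#’ is its very first character is an assumption, not a statement about how the gettext tools read the file.

```c
#include <ctype.h>

/* Hypothetical helper: return 1 if LINE (one line of po/POTFILES.in,
   without its trailing newline) names a source file, and 0 if it is
   a hash-marked comment or a white line.
   Assumption: a comment has '#' as its very first character.  */
int
is_potfiles_entry (const char *line)
{
  const char *p;

  if (line[0] == '#')
    return 0;
  for (p = line; *p != '\0'; p++)
    if (!isspace ((unsigned char) *p))
      return 1;               /* found something other than whitespace */
  return 0;                   /* empty or whitespace-only line */
}
```

A caller would read ‘po/POTFILES.in’ line by line and keep only the lines for which the function returns 1; each kept line is then a path relative to the top level of the distribution.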
When a C file is automatically generated by a tool, like flex or bison, that doesn't introduce translatable strings by itself, it is recommended to list in ‘po/POTFILES.in’ the real source file (ending in ‘.l’ in the case of flex, or in ‘.y’ in the case of bison), not the generated C file.
The ‘po/’ directory should also receive a file named ‘LINGUAS’. This file contains the list of available translations. It is a whitespace separated list. Hash-marked comments and white lines are ignored. Here is an example file:
    # Set of available languages.
    de fr

This example means that German and French PO files are available, so that these languages are currently supported by your package. If you want to further restrict, at installation time, the set of installed languages, this should not be done by modifying the ‘LINGUAS’ file, but rather by using the LINGUAS environment variable (see Installers).
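To make the ‘LINGUAS’ file format concrete, here is a small C sketch that checks whether a language code appears in the contents of such a file. It assumes the whole file has already been read into a string; the name linguas_contains is hypothetical, and treating everything from ‘#’ to the end of the line as a comment is an assumption made for this illustration.

```c
#include <string.h>

/* Hypothetical helper: return 1 if LANG is one of the
   whitespace-separated entries in TEXT (the contents of po/LINGUAS),
   0 otherwise.  Assumption: '#' starts a comment that runs to the
   end of the line.  */
int
linguas_contains (const char *text, const char *lang)
{
  size_t lang_len = strlen (lang);
  const char *p = text;

  while (*p != '\0')
    {
      if (*p == '#')
        {
          /* Comment: skip to the end of the line.  */
          while (*p != '\0' && *p != '\n')
            p++;
        }
      else if (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r')
        p++;
      else
        {
          /* An entry runs up to the next whitespace or comment.  */
          size_t len = strcspn (p, " \t\r\n#");
          if (len == lang_len && strncmp (p, lang, len) == 0)
            return 1;
          p += len;
        }
    }
  return 0;
}
```

With the example file above, the function reports ‘de’ and ‘fr’ as present and ignores the words inside the comment line.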
It is recommended that you add the "languages" ‘en@quot’ and ‘en@boldquot’ to the LINGUAS file. en@quot is a variant of English message catalogs (en) which uses real quotation marks instead of the ugly looking asymmetric ASCII substitutes ‘`’ and ‘'’. en@boldquot is a variant of en@quot that additionally outputs quoted pieces of text in a bold font, when used in a terminal emulator which supports the VT100 escape sequences (such as xterm or the Linux console, but not Emacs in M-x shell mode).
These extra message catalogs ‘en@quot’ and ‘en@boldquot’ are constructed automatically, not by translators; to support them, you need the files ‘Rules-quot’, ‘quot.sed’, ‘boldquot.sed’, ‘en@quot.header’, ‘en@boldquot.header’, ‘insert-header.sin’ in the ‘po/’ directory. You can copy them from GNU gettext's ‘po/’ directory; they are also installed by running gettextize.
The ‘po/’ directory also has a file named ‘Makevars’. It contains variables that are specific to your project. ‘po/Makevars’ gets inserted into the ‘po/Makefile’ when the latter is created. The variables thus take effect when the POT file is created or updated, and when the message catalogs get installed.
The first three variables can be left unmodified if your package has a single message domain and, accordingly, a single ‘po/’ directory. Only packages which have multiple ‘po/’ directories at different locations need to adjust the three first variables defined in ‘Makevars’.
As an alternative to the XGETTEXT_OPTIONS variable, it is also possible to specify xgettext options through the AM_XGETTEXT_OPTION autoconf macro. See AM_XGETTEXT_OPTION in ‘po.m4’.
All files called ‘Rules-*’ in the ‘po/’ directory get appended to the ‘po/Makefile’ when it is created. They present an opportunity to add rules for special PO files to the Makefile, without needing to mess with ‘po/Makefile.in.in’.
GNU gettext comes with a ‘Rules-quot’ file, containing rules for building catalogs ‘en@quot.po’ and ‘en@boldquot.po’. The effect of ‘en@quot.po’ is that people who set their LANGUAGE environment variable to ‘en@quot’ will get messages with proper looking symmetric Unicode quotation marks instead of abusing the ASCII grave accent and the ASCII apostrophe for indicating quotations. To enable this catalog, simply add en@quot to the ‘po/LINGUAS’ file. The effect of ‘en@boldquot.po’ is that people who set LANGUAGE to ‘en@boldquot’ will get not only proper quotation marks, but also the quoted text will be shown in a bold font on terminals and consoles. This catalog is useful only for command-line programs, not GUI programs. To enable it, similarly add en@boldquot to the ‘po/LINGUAS’ file.
Similarly, you can create rules for building message catalogs for the ‘sr@latin’ locale (Serbian written with the Latin alphabet) from those for the ‘sr’ locale (Serbian written with Cyrillic letters). See msgfilter Invocation.
‘configure.ac’ or ‘configure.in’ - this is the source from which autoconf generates the ‘configure’ script.
Declare the package and version. This is done by a set of lines like these:
    PACKAGE=gettext
    VERSION=0.17
    AC_DEFINE_UNQUOTED(PACKAGE, "$PACKAGE")
    AC_DEFINE_UNQUOTED(VERSION, "$VERSION")
    AC_SUBST(PACKAGE)
    AC_SUBST(VERSION)
or, if you are using GNU automake, by a line like this:
    AM_INIT_AUTOMAKE(gettext, 0.17)
Of course, you replace ‘gettext’ with the name of your package, and ‘0.17’ by its version numbers, exactly as they should appear in the packaged tar file name of your distribution (‘gettext-0.17.tar.gz’, here).
Check for internationalization support. Here is the main m4 macro for triggering internationalization support. Just add this line to ‘configure.ac’:
    AM_GNU_GETTEXT
This call is purposely simple, even if it generates a lot of configure time checking and actions.
If you have suppressed the ‘intl/’ subdirectory by calling gettextize without the ‘--intl’ option, this call should read
    AM_GNU_GETTEXT([external])
Have output files created. The AC_OUTPUT directive, at the end of your ‘configure.ac’ file, needs to be modified in two ways:
    AC_OUTPUT([existing configuration files intl/Makefile po/Makefile.in],
    [existing additional actions])
The modification to the first argument to AC_OUTPUT asks for substitution in the ‘intl/’ and ‘po/’ directories. Note the ‘.in’ suffix used for ‘po/’ only. This is because the distributed file is really ‘po/Makefile.in.in’.
If you have suppressed the ‘intl/’ subdirectory by calling gettextize without the ‘--intl’ option, then you don't need to add intl/Makefile to the AC_OUTPUT line.
If, after doing the recommended modifications, a command like ‘aclocal -I m4’ or ‘autoconf’ or ‘autoreconf’ fails with a trace similar to this:
    configure.ac:44: warning: AC_COMPILE_IFELSE was called before AC_GNU_SOURCE
    ../../lib/autoconf/specific.m4:335: AC_GNU_SOURCE is expanded from...
    m4/lock.m4:224: gl_LOCK is expanded from...
    m4/gettext.m4:571: gt_INTL_SUBDIR_CORE is expanded from...
    m4/gettext.m4:472: AM_INTL_SUBDIR is expanded from...
    m4/gettext.m4:347: AM_GNU_GETTEXT is expanded from...
    configure.ac:44: the top level
    configure.ac:44: warning: AC_RUN_IFELSE was called before AC_GNU_SOURCE
you need to add an explicit invocation of ‘AC_GNU_SOURCE’ in the ‘configure.ac’ file - after ‘AC_PROG_CC’ but before ‘AM_GNU_GETTEXT’, most likely very close to the ‘AC_PROG_CC’ invocation. This is necessary because of ordering restrictions imposed by GNU autoconf.
If you haven't suppressed the ‘intl/’ subdirectory, you need to add the GNU ‘config.guess’ and ‘config.sub’ files to your distribution. They are needed because the ‘intl/’ directory has platform dependent support for determining the locale's character encoding and therefore needs to identify the platform.
You can obtain the newest version of ‘config.guess’ and ‘config.sub’ from the CVS of the ‘config’ project at ‘http://savannah.gnu.org/’. The commands to fetch them are
    $ wget 'http://savannah.gnu.org/cgi-bin/viewcvs/*checkout*/config/config/config.guess'
    $ wget 'http://savannah.gnu.org/cgi-bin/viewcvs/*checkout*/config/config/config.sub'
Less recent versions are also contained in the GNU automake and GNU libtool packages.
Normally, ‘config.guess’ and ‘config.sub’ are put at the top level of a distribution. But it is also possible to put them in a subdirectory, together with other configuration support files like ‘install-sh’, ‘ltconfig’, ‘ltmain.sh’ or ‘missing’. All you need to do, other than moving the files, is to add the following line to your ‘configure.ac’:
    AC_CONFIG_AUX_DIR([subdir])
With earlier versions of GNU gettext, you needed to add the GNU ‘mkinstalldirs’ script to your distribution. This is not needed any more. You can remove it, unless you are also using an automake version older than automake 1.9.
If you do not have an ‘aclocal.m4’ file in your distribution, the
simplest is to concatenate the files ‘codeset.m4’, ‘gettext.m4’,
‘glibc2.m4’, ‘glibc21.m4’, ‘iconv.m4’, ‘intdiv0.m4’,
‘intl.m4’, ‘intldir.m4’, ‘intlmacosx.m4’, ‘intmax.m4’,
‘inttypes_h.m4’, ‘inttypes-pri.m4’, ‘lcmessage.m4’,
‘lib-ld.m4’, ‘lib-link.m4’, ‘lib-prefix.m4’, ‘lock.m4’,
‘longlong.m4’, ‘nls.m4’, ‘po.m4’, ‘printf-posix.m4’,
‘progtest.m4’, ‘size_max.m4’, ‘stdint_h.m4’,
‘uintmax_t.m4’, ‘visibility.m4’, ‘wchar_t.m4’,
‘wint_t.m4’, ‘xsize.m4’ from GNU gettext's ‘m4/’ directory into a single file. If you have suppressed the ‘intl/’
directory, only ‘gettext.m4’, ‘iconv.m4’, ‘lib-ld.m4’,
‘lib-link.m4’, ‘lib-prefix.m4’, ‘nls.m4’, ‘po.m4’,
‘progtest.m4’ need to be concatenated.
If you are not using GNU automake 1.8 or newer, you will need to add a file ‘mkdirp.m4’ from a newer automake distribution to the list of files above.
If you already have an ‘aclocal.m4’ file, then you will have to merge the said macro files into your ‘aclocal.m4’. Note that if you are upgrading from a previous release of GNU gettext, you should most probably replace the macros (AM_GNU_GETTEXT, etc.), as they usually change a little from one release of GNU gettext to the next. Their contents may vary as we get more experience with strange systems out there.
If you are using GNU automake 1.5 or newer, it is enough to put these macro files into a subdirectory named ‘m4/’ and add the line
    ACLOCAL_AMFLAGS = -I m4
to your top level ‘Makefile.am’.
These macros check for the internationalization support functions and related information. Hopefully, once stabilized, these macros might be integrated in the standard Autoconf set, because this piece of m4 code will be the same for all projects using GNU gettext.
Earlier GNU gettext releases required you to put definitions for ENABLE_NLS, HAVE_GETTEXT and HAVE_LC_MESSAGES, HAVE_STPCPY, PACKAGE and VERSION into an ‘acconfig.h’ file. This is not needed any more; you can remove them from your ‘acconfig.h’ file unless your package uses them independently from the ‘intl/’ directory.
The include file template that holds the C macros to be defined by configure is usually called ‘config.h.in’ and may be maintained either manually or automatically.
If gettextize has created an ‘intl/’ directory, this file must be called ‘config.h.in’ and must be at the top level. If, however, you have suppressed the ‘intl/’ directory by calling gettextize without the ‘--intl’ option, then you can choose the name of this file and its location freely.
If it is maintained automatically, by use of the ‘autoheader’ program, you need to do nothing about it. This is the case in particular if you are using GNU automake.
If it is maintained manually, and if gettextize has created an ‘intl/’ directory, you should switch to using ‘autoheader’. The list of C macros to be added for the sake of the ‘intl/’ directory is just too long to be maintained manually; it also changes between different versions of GNU gettext.
If it is maintained manually, and if on the other hand you have suppressed the ‘intl/’ directory by calling gettextize without the ‘--intl’ option, then you can get away with adding the following lines to ‘config.h.in’:
    /* Define to 1 if translation of program messages to the user's
       native language is requested. */
    #undef ENABLE_NLS
Here are a few modifications you need to make to your main, top-level ‘Makefile.in’ file.
Add the following lines near the beginning of your ‘Makefile.in’, so the ‘dist:’ goal will work properly (as explained further down):
    PACKAGE = @PACKAGE@
    VERSION = @VERSION@
Add file ‘ABOUT-NLS’ to the DISTFILES definition, so the file gets distributed.
Wherever you process subdirectories in your ‘Makefile.in’, be sure you also process the subdirectories ‘intl’ and ‘po’.
If you are using Makefiles, either generated by automake, or hand-written so they carefully follow the GNU coding standards, the affected goals for which the new subdirectories must be handled include ‘installdirs’, ‘install’, ‘uninstall’, ‘clean’, ‘distclean’.
Here is an example of a canonical order of processing. In this example, we also define SUBDIRS in Makefile.in for it to be further used in the ‘dist:’ goal.
    SUBDIRS = doc intl lib src po
Note that you must arrange for ‘make’ to descend into the intl directory before descending into other directories containing code which makes use of the libintl.h header file. For this reason, here we mention intl before lib and src.
A delicate point is the ‘dist:’ goal, as both ‘intl/Makefile’ and ‘po/Makefile’ will later assume that the proper directory has been set up from the main ‘Makefile’. Here is an example of what the ‘dist:’ goal might look like:
    distdir = $(PACKAGE)-$(VERSION)
    dist: Makefile
            rm -fr $(distdir)
            mkdir $(distdir)
            chmod 777 $(distdir)
            for file in $(DISTFILES); do \
              ln $$file $(distdir) 2>/dev/null || cp -p $$file $(distdir); \
            done
            for subdir in $(SUBDIRS); do \
              mkdir $(distdir)/$$subdir || exit 1; \
              chmod 777 $(distdir)/$$subdir; \
              (cd $$subdir && $(MAKE) $@) || exit 1; \
            done
            tar chozf $(distdir).tar.gz $(distdir)
            rm -fr $(distdir)
Note that if you are using GNU automake, ‘Makefile.in’ is automatically generated from ‘Makefile.am’, and all needed changes to ‘Makefile.am’ are already made by running ‘gettextize’.
Some of the modifications made in the main ‘Makefile.in’ will also be needed in the ‘Makefile.in’ from your package sources, which we assume here to be in the ‘src/’ subdirectory. Here are all the modifications needed in ‘src/Makefile.in’:
In view of the ‘dist:’ goal, you should have these lines near the beginning of ‘src/Makefile.in’:
    PACKAGE = @PACKAGE@
    VERSION = @VERSION@
If not done already, you should guarantee that top_srcdir gets defined. This will serve for cpp include files. Just add the line:
    top_srcdir = @top_srcdir@
You might also want to define subdir as ‘src’, later allowing for almost uniform ‘dist:’ goals in all your ‘Makefile.in’. At least, the ‘dist:’ goal below assumes that you used:
    subdir = src
The main function of your program will normally call bindtextdomain (see Triggering), like this:
    bindtextdomain (PACKAGE, LOCALEDIR);
    textdomain (PACKAGE);
To make LOCALEDIR known to the program, add the following lines to ‘Makefile.in’:
    datadir = @datadir@
    localedir = $(datadir)/locale
    DEFS = -DLOCALEDIR=\"$(localedir)\" @DEFS@
Note that @datadir@ defaults to ‘$(prefix)/share’, thus $(localedir) defaults to ‘$(prefix)/share/locale’.
You should ensure that the final linking will use @LIBINTL@ or @LTLIBINTL@ as a library. @LIBINTL@ is for use without libtool, @LTLIBINTL@ is for use with libtool. An easy way to achieve this is to arrange for it to get into LIBS, like this:
    LIBS = @LIBINTL@ @LIBS@
In most packages internationalized with GNU gettext, one will find a directory ‘lib/’ in which a library containing some helper functions will be built. (You need at least the few functions which the GNU gettext library itself needs.) However, some of the functions in ‘lib/’ also give messages to the user, which of course should be translated, too. To take care of this, the support library (say ‘libsupport.a’) should be placed before @LIBINTL@ and @LIBS@ in the above example. So one has to write this:
    LIBS = ../lib/libsupport.a @LIBINTL@ @LIBS@
Your ‘dist:’ goal has to conform with the others. Here is a reasonable definition for it:
    distdir = ../$(PACKAGE)-$(VERSION)/$(subdir)
    dist: Makefile $(DISTFILES)
            for file in $(DISTFILES); do \
              ln $$file $(distdir) 2>/dev/null || cp -p $$file $(distdir) || exit 1; \
            done
Note that if you are using GNU automake, ‘Makefile.in’ is automatically generated from ‘Makefile.am’, and the first three changes and the last change are not necessary. The remaining needed ‘Makefile.am’ modifications are the following:
To make LOCALEDIR known to the program, add the following to ‘Makefile.am’:
    <module>_CPPFLAGS = -DLOCALEDIR=\"$(localedir)\"
for each specific module or compilation unit, or
    AM_CPPFLAGS = -DLOCALEDIR=\"$(localedir)\"
for all modules and compilation units together. Furthermore, add this line to define ‘localedir’:
    localedir = $(datadir)/locale
To ensure that the final linking will use @LIBINTL@ or @LTLIBINTL@ as a library, add the following to ‘Makefile.am’:
    <program>_LDADD = @LIBINTL@
for each specific program, or
    LDADD = @LIBINTL@
for all programs together. Remember that when you use libtool to link a program, you need to use @LTLIBINTL@ instead of @LIBINTL@ for that program.
If you have an ‘intl/’ directory whose contents are created by gettextize, then to ensure that it will be searched for C preprocessor include files in all circumstances, add something like this to ‘Makefile.am’:
    AM_CPPFLAGS = -I../intl -I$(top_srcdir)/intl
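On the C side, the bindtextdomain and textdomain calls from this section are typically wrapped in a small start-up routine. The sketch below is one way this might look; the function name init_i18n is hypothetical, the call to setlocale is the usual companion of these calls, and the fallback definitions of PACKAGE and LOCALEDIR are assumptions that exist only so the fragment compiles without the ‘-DLOCALEDIR=...’ flag and ‘config.h’ described above.

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Fallbacks so this sketch compiles on its own.  In a real package
   PACKAGE comes from config.h and LOCALEDIR from -DLOCALEDIR=...  */
#ifndef PACKAGE
# define PACKAGE "hello"                      /* hypothetical package name */
#endif
#ifndef LOCALEDIR
# define LOCALEDIR "/usr/local/share/locale"  /* assumed default location */
#endif

/* Hypothetical start-up helper: select the user's locale, tell
   gettext where the message catalogs of PACKAGE are installed, and
   make PACKAGE the default text domain.  Returns the name of the
   active text domain, or NULL on failure.  */
const char *
init_i18n (void)
{
  setlocale (LC_ALL, "");
  if (bindtextdomain (PACKAGE, LOCALEDIR) == NULL)
    return NULL;
  return textdomain (PACKAGE);
}
```

The main function would call init_i18n once, before the first message is printed. When gettext lives in a separate libintl rather than in the C library, the program must also be linked with the @LIBINTL@ (or @LTLIBINTL@) value as described above.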
Internationalization of packages, as provided by GNU gettext, is optional. It can be turned off in two situations:
When the installer has specified ‘./configure --disable-nls’.
When the package does not include the intl/ subdirectory, and the libintl.h header (with its associated libintl library, if any) is not already installed on the system, it is preferable that the package builds without internationalization support, rather than to give a compilation error.
A C preprocessor macro can be used to detect these two cases. Usually, when libintl.h was found and not explicitly disabled, the ENABLE_NLS macro will be defined to 1 in the autoconf generated configuration file (usually called ‘config.h’). In the two negative situations, however, this macro will not be defined, thus it will evaluate to 0 in C preprocessor expressions.
‘gettext.h’ is a convenience header file for conditional use of ‘<libintl.h>’, depending on the ENABLE_NLS macro. If ENABLE_NLS is set, it includes ‘<libintl.h>’; otherwise it defines no-op substitutes for the libintl.h functions. We recommend the use of "gettext.h" over direct use of ‘<libintl.h>’, so that portability to older systems is guaranteed and installers can turn off internationalization if they want to. In the C code, you will then write
    #include "gettext.h"
instead of
    #include <libintl.h>
The location of gettext.h is usually in a directory containing auxiliary include files. In many GNU packages, there is a directory ‘lib/’ containing helper functions; ‘gettext.h’ fits there. In other packages, it can go into the ‘src’ directory.
Do not install the gettext.h file in public locations. Every package that needs it should contain a copy of it on its own.
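The idea behind this conditional use can be illustrated with a few lines of C. This is not the contents of the real ‘gettext.h’, only a much reduced sketch of the same pattern; the ‘_’ shorthand macro and the function greeting are illustrative assumptions.

```c
/* Reduced sketch of the pattern: use the real gettext() when
   ENABLE_NLS is 1, and a no-op substitute otherwise.  An undefined
   ENABLE_NLS evaluates to 0 in the #if below.  */
#if ENABLE_NLS
# include <libintl.h>
# define _(msgid) gettext (msgid)
#else
# define _(msgid) (msgid)     /* no-op: return the untranslated string */
#endif

/* Hypothetical function, used only to show a marked string.  */
const char *
greeting (void)
{
  return _("Hello, world!");
}
```

When the file is compiled without ENABLE_NLS, greeting simply returns the English string and no reference to libintl remains in the object file, which is what allows a package to build on systems without libintl.h.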
GNU gettext installs macros for use in a package's ‘configure.ac’ or ‘configure.in’. See the Introduction of The Autoconf Manual. The primary macro is, of course, AM_GNU_GETTEXT.
The AM_GNU_GETTEXT macro tests for the presence of the GNU gettext function family in either the C library or a separate libintl library (shared or static libraries are both supported) or in the package's ‘intl/’ directory. It also invokes AM_PO_SUBDIRS, thus preparing the ‘po/’ directories of the package for building.
AM_GNU_GETTEXT accepts up to three optional arguments. The general syntax is
    AM_GNU_GETTEXT([intlsymbol], [needsymbol], [intldir])
intlsymbol can be ‘external’ or ‘no-libtool’. The default (if it is not specified or empty) is ‘no-libtool’. intlsymbol should be ‘external’ for packages with no ‘intl/’ directory. For packages with an ‘intl/’ directory, you can either use an intlsymbol equal to ‘no-libtool’, or you can use ‘external’ and override by using the macro AM_GNU_GETTEXT_INTL_SUBDIR elsewhere. The two ways to specify the existence of an ‘intl/’ directory are equivalent. At build time, a static library $(top_builddir)/intl/libintl.a will then be created.
If needsymbol is specified and is ‘need-ngettext’, then GNU gettext implementations (in libc or libintl) without the ngettext() function will be ignored. If needsymbol is specified and is ‘need-formatstring-macros’, then GNU gettext implementations that don't support the ISO C 99 ‘<inttypes.h>’ formatstring macros will be ignored. Only one needsymbol can be specified. These requirements can also be specified by using the macro AM_GNU_GETTEXT_NEED elsewhere. To specify more than one requirement, just specify the strongest one among them, or invoke the AM_GNU_GETTEXT_NEED macro several times. The hierarchy among the various alternatives is as follows: ‘need-formatstring-macros’ implies ‘need-ngettext’.
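A package would typically ask for ‘need-ngettext’ when its code contains calls such as the one in the following sketch. The function name format_file_count and the message texts are illustrative assumptions; only ngettext itself is the libintl function the requirement refers to.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper: write a count-dependent message into BUF.
   ngettext() chooses between the singular and the plural form, or
   a translation of them when a message catalog is available.  */
int
format_file_count (char *buf, size_t size, unsigned long n)
{
  return snprintf (buf, size, ngettext ("%lu file", "%lu files", n), n);
}
```

Code like this cannot work with a gettext implementation that lacks ngettext, which is why such implementations are ignored when the requirement is declared.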
intldir is used to find the intl libraries. If empty, the value ‘$(top_builddir)/intl/’ is used.
The AM_GNU_GETTEXT macro determines whether GNU gettext is available and should be used. If so, it sets the USE_NLS variable to ‘yes’; it defines ENABLE_NLS to 1 in the autoconf generated configuration file (usually called ‘config.h’); it sets the variables LIBINTL and LTLIBINTL to the linker options for use in a Makefile (LIBINTL for use without libtool, LTLIBINTL for use with libtool); it adds an ‘-I’ option to CPPFLAGS if necessary. In the negative case, it sets USE_NLS to ‘no’; it sets LIBINTL and LTLIBINTL to empty and doesn't change CPPFLAGS.
The complexities that AM_GNU_GETTEXT deals with are the following:
Some operating systems have gettext in the C library, for example glibc. Some have it in a separate library libintl. GNU libintl might have been installed as part of the GNU gettext package.
GNU libintl, if installed, is not necessarily already in the search path (CPPFLAGS for the include file search path, LDFLAGS for the library search path).
Except for glibc, the operating system's native gettext cannot exploit the GNU mo files, doesn't have the necessary locale dependency features, and cannot convert messages from the catalog's text encoding to the user's locale encoding.
GNU libintl, if installed, is not necessarily already in the run time library search path. To avoid the need for setting an environment variable like LD_LIBRARY_PATH, the macro adds the appropriate run time search path options to the LIBINTL and LTLIBINTL variables. This works on most systems, but not on some operating systems with limited shared library support, like SCO.
GNU libintl relies on POSIX/XSI iconv. The macro checks for linker options needed to use iconv and appends them to the LIBINTL and LTLIBINTL variables.
The AM_GNU_GETTEXT_VERSION macro declares the version number of the GNU gettext infrastructure that is used by the package.
The use of this macro is optional; only the autopoint program makes use of it (see CVS Issues).
The AM_GNU_GETTEXT_NEED macro declares a constraint regarding the GNU gettext implementation. The syntax is
    AM_GNU_GETTEXT_NEED([needsymbol])
If needsymbol is ‘need-ngettext’, then GNU gettext implementations (in libc or libintl) without the ngettext() function will be ignored. If needsymbol is ‘need-formatstring-macros’, then GNU gettext implementations that don't support the ISO C 99 ‘<inttypes.h>’ formatstring macros will be ignored.
The optional second argument of AM_GNU_GETTEXT is also taken into account.
The AM_GNU_GETTEXT_NEED invocations can occur before or after the AM_GNU_GETTEXT invocation; the order doesn't matter.
The AM_GNU_GETTEXT_INTL_SUBDIR macro specifies that the AM_GNU_GETTEXT macro, although invoked with the first argument ‘external’, should also prepare for building the ‘intl/’ subdirectory.
The AM_GNU_GETTEXT_INTL_SUBDIR invocation can occur before or after the AM_GNU_GETTEXT invocation; the order doesn't matter.
The use of this macro requires GNU automake 1.10 or newer and GNU autoconf 2.61 or newer.
The AM_PO_SUBDIRS macro prepares the ‘po/’ directories of the package for building. This macro should be used in internationalized programs written in programming languages other than C, C++, Objective C, for example sh, Python, Lisp. See Programming Languages for a list of programming languages that support localization through PO files.
The AM_PO_SUBDIRS macro determines whether internationalization should be used. If so, it sets the USE_NLS variable to ‘yes’, otherwise to ‘no’. It also determines the right values for Makefile variables in each ‘po/’ directory.
The AM_XGETTEXT_OPTION macro registers a command-line option to be used in the invocations of xgettext in the ‘po/’ directories of the package.
For example, if you have a source file that defines a function ‘error_at_line’ whose fifth argument is a format string, you can use
    AM_XGETTEXT_OPTION([--flag=error_at_line:5:c-format])
to instruct xgettext to mark all translatable strings in ‘gettext’ invocations that occur as fifth argument to this function as ‘c-format’.
See xgettext Invocation for the list of options that xgettext accepts.
The use of this macro is an alternative to the use of the ‘XGETTEXT_OPTIONS’ variable in ‘po/Makevars’.
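To show what "a function whose fifth argument is a format string" looks like in practice, here is a small C sketch. The function format_at_line is hypothetical and is not the error_at_line function mentioned above; it merely has its format string in the same position, so the corresponding option for it would use the same ‘:5:c-format’ suffix.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical function whose FIFTH parameter is a printf-style
   format string.  It writes "FILENAME:LINENUM: message" into BUF
   and returns the length written, or -1 if BUF is too small.  */
int
format_at_line (char *buf, size_t size, const char *filename,
                unsigned int linenum, const char *format, ...)
{
  va_list args;
  int prefix_len, body_len;

  prefix_len = snprintf (buf, size, "%s:%u: ", filename, linenum);
  if (prefix_len < 0 || (size_t) prefix_len >= size)
    return -1;

  va_start (args, format);
  body_len = vsnprintf (buf + prefix_len, size - prefix_len, format, args);
  va_end (args);
  if (body_len < 0 || (size_t) body_len >= size - prefix_len)
    return -1;
  return prefix_len + body_len;
}
```

A call site would pass a translatable string in the fifth position, for example gettext ("unexpected token '%s'"); without the flag, xgettext has no way of knowing that this string is a C format string.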
The AM_ICONV macro tests for the presence of the POSIX/XSI iconv function family in either the C library or a separate libiconv library. If found, it sets the am_cv_func_iconv variable to ‘yes’; it defines HAVE_ICONV to 1 in the autoconf generated configuration file (usually called ‘config.h’); it defines ICONV_CONST to ‘const’ or to empty, depending on whether the second argument of iconv() is of type ‘const char **’ or ‘char **’; it sets the variables LIBICONV and LTLIBICONV to the linker options for use in a Makefile (LIBICONV for use without libtool, LTLIBICONV for use with libtool); it adds an ‘-I’ option to CPPFLAGS if necessary. If not found, it sets LIBICONV and LTLIBICONV to empty and doesn't change CPPFLAGS.
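The following C sketch shows how the ICONV_CONST definition is typically used to write an iconv call that compiles with both prototypes. The function recode_string is hypothetical, and the empty fallback definition of ICONV_CONST is an assumption made so that the fragment compiles without a ‘config.h’; in a real package the value comes from the macro's test.

```c
#include <iconv.h>
#include <string.h>

#ifndef ICONV_CONST
# define ICONV_CONST  /* assumption: iconv() takes 'char **' here */
#endif

/* Hypothetical helper: convert the NUL-terminated string IN from
   encoding FROMCODE to encoding TOCODE, storing a NUL-terminated
   result in OUT (OUTSIZE bytes).  Returns 0 on success, -1 on error.  */
int
recode_string (const char *tocode, const char *fromcode,
               const char *in, char *out, size_t outsize)
{
  iconv_t cd;
  ICONV_CONST char *inptr = (ICONV_CONST char *) in;
  char *outptr = out;
  size_t inleft = strlen (in);
  size_t outleft;
  size_t res;

  if (outsize == 0)
    return -1;
  outleft = outsize - 1;        /* keep room for the final NUL */

  cd = iconv_open (tocode, fromcode);
  if (cd == (iconv_t) -1)
    return -1;
  res = iconv (cd, &inptr, &inleft, &outptr, &outleft);
  iconv_close (cd);
  if (res == (size_t) -1)
    return -1;
  *outptr = '\0';
  return 0;
}
```

A program using such a helper has to be linked with the LIBICONV (or LTLIBICONV) value when iconv is provided by a separate library. Which encoding names are accepted depends on the iconv implementation in use.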
The complexities that AM_ICONV deals with are the following:
Some operating systems have iconv in the C library, for example glibc. Some have it in a separate library libiconv, for example OSF/1 or FreeBSD. Regardless of the operating system, GNU libiconv might have been installed. In that case, it should be used instead of the operating system's native iconv.
GNU libiconv, if installed, is not necessarily already in the search path (CPPFLAGS for the include file search path, LDFLAGS for the library search path).
GNU libiconv is binary incompatible with some operating system's native iconv, for example on FreeBSD. Use of an ‘iconv.h’ and ‘libiconv.so’ that don't fit together would produce program crashes.
GNU libiconv, if installed, is not necessarily already in the run time library search path. To avoid the need for setting an environment variable like LD_LIBRARY_PATH, the macro adds the appropriate run time search path options to the LIBICONV variable. This works on most systems, but not on some operating systems with limited shared library support, like SCO.
‘iconv.m4’ is distributed with the GNU gettext package because ‘gettext.m4’ relies on it.
Many projects use CVS for distributed development, version control and source backup. This section gives some advice on how to manage the uses of cvs, gettextize, autopoint and autoconf.
13.6.1 Avoiding inconsistencies in distributed development
13.6.2 Files to put under CVS version control
13.6.3 Invoking the autopoint program
In a project with multiple developers using CVS, there should be a single developer who occasionally - when there is desire to upgrade to a new gettext version - runs gettextize and performs the changes listed in Adjusting Files, and then commits his changes to the CVS.
It is highly recommended that all developers on a project use the same version of GNU gettext in the package. In other words, if a developer runs gettextize, he should go the whole way, make the necessary remaining changes and commit his changes to the CVS. Otherwise the following damages will likely occur:
Since some gettext specific portions in ‘configure.ac’, ‘configure.in’ and Makefile.am, Makefile.in files depend on the gettext version, the use of infrastructure files belonging to different gettext versions can easily lead to build errors.
If a developer uses a different version of gettext than the other developers, the distribution will be less well tested than if all had been using the same gettext version. For example, it is possible that a platform specific bug goes undiscovered due to this constellation.
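There are basically three ways to deal with generated files in the context of a CVS repository, such as ‘configure’ generated from ‘configure.ac’, parser.c generated from parser.y, or po/Makefile.in.in autoinstalled by gettextize or autopoint.
1. All generated files are always committed into the repository.
2. All generated files are committed into the repository occasionally, for example each time a release is made.
3. Generated files are never committed into the repository.
Each of these three approaches has different advantages and drawbacks.
1. The advantage is that anyone can check out the CVS at any moment and gets a working build. The drawbacks are: 1a. It requires some frequent "cvs commit" actions by the maintainers. 1b. The repository grows in size quite fast.
2. The advantage is that anyone can check out the CVS, and the usual "./configure; make" will work. The drawbacks are: 2a. The one who checks out the CVS needs tools like GNU automake, GNU autoconf, GNU m4 installed in his PATH; sometimes he even needs particular versions of them. 2b. When a release is made and a commit is made on the generated files, the other developers get conflicts on the generated files after doing "cvs update". Although these conflicts are easy to resolve, they are annoying.
3. The advantage is less work for the maintainers. The drawback is that anyone who checks out the CVS not only needs tools like GNU automake, GNU autoconf, GNU m4 installed in his PATH, but also that he needs to perform a package specific pre-build step before being able to "./configure; make".
For the first and second approach, all files modified or brought in by the occasional gettextize invocation and update should be committed into the CVS.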
For the third approach, the maintainer can omit from the CVS repository all
the files that gettextize
mentions as "copy". Instead, he adds to
the ‘configure.ac’ or ‘configure.in’ a line of the form
AM_GNU_GETTEXT_VERSION(0.17) |
and adds to the package's pre-build script an invocation of
‘autopoint’. For everyone who checks out the CVS, this
autopoint
invocation will copy into the right place the
gettext
infrastructure files that have been omitted from the CVS.
The version number used as argument to AM_GNU_GETTEXT_VERSION is the version of the gettext infrastructure that the package wants to use. It is also the minimum version number of the ‘autopoint’ program. So, if you write AM_GNU_GETTEXT_VERSION(0.11.5) then the developers can have any version >= 0.11.5 installed; the package will work with the 0.11.5 infrastructure in all developers' builds. When the maintainer then runs gettextize from, say, version 0.12.1 on the package, the occurrence of AM_GNU_GETTEXT_VERSION(0.11.5) will be changed into AM_GNU_GETTEXT_VERSION(0.12.1), and all other developers that use the CVS will henceforth need to have GNU gettext 0.12.1 or newer installed.
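The version rule can be sketched in a few lines of Python. This is only an illustration of the comparison, not how autopoint is implemented: the function names are hypothetical, and the parsing assumes the simple one-argument form of the macro shown above.

```python
import re

def parse_version(text):
    """Turn a dotted version such as '0.11.5' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))

def required_gettext_version(configure_ac_text):
    """Return the argument of AM_GNU_GETTEXT_VERSION, or None if absent."""
    match = re.search(r"AM_GNU_GETTEXT_VERSION\(\[?([0-9.]+)\]?\)", configure_ac_text)
    return match.group(1) if match else None

def satisfies(installed, required):
    """True if the installed gettext is at least the required version."""
    return parse_version(installed) >= parse_version(required)

# Hypothetical configure.ac content, for illustration only.
configure_ac = "AC_INIT\nAM_GNU_GETTEXT_VERSION(0.11.5)\n"
required = required_gettext_version(configure_ac)
print(required, satisfies("0.12.1", required), satisfies("0.10.40", required))
```

Comparing tuples of integers rather than strings matters here: as plain strings, "0.9" would sort after "0.11.5".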
autopoint
autopoint [option]...
The autopoint program copies standard gettext infrastructure files into a source package. It extracts the gettext version used by the package from a macro call of the form AM_GNU_GETTEXT_VERSION(version), found in the package's ‘configure.in’ or ‘configure.ac’ file, and copies the infrastructure files belonging to this version into the package.
‘-f’, ‘--force’
Force overwriting of files that already exist.
‘-n’, ‘--dry-run’
Print modifications but don't perform them. All file copying actions that autopoint would normally execute are inhibited and instead only listed on standard output.
‘--help’
Display this help and exit.
‘--version’
Output version information and exit.
autopoint supports GNU gettext versions from 0.10.35 to the current one, 0.17. To apply autopoint to a package using a gettext version newer than 0.17, you need to install at least that same version of GNU gettext.
In packages using GNU automake, an invocation of autopoint should be followed by invocations of aclocal and then autoconf and autoheader. The reason is that autopoint installs some autoconf macro files, which are used by aclocal to create ‘aclocal.m4’; the latter is used by autoconf to create the package's ‘configure’ script, and by autoheader to create the package's ‘config.h.in’ include file template.
The name ‘autopoint’ is an abbreviation of ‘auto-po-intl-m4’; the tool copies or updates mostly files in the ‘po’, ‘intl’, ‘m4’ directories.
In projects that use GNU automake, the usual commands for creating a distribution tarball, ‘make dist’ or ‘make distcheck’, automatically update the PO files as needed.
If GNU automake is not used, the maintainer needs to perform this update before making a release:
$ ./configure
$ (cd po; make update-po)
$ make distclean
By default, packages fully using GNU gettext, internally, are installed in such a way that they allow translation of messages. At configuration time, those packages should automatically detect whether the underlying host system already provides the GNU gettext functions. If not, the GNU gettext library should be automatically prepared and used. Installers may use special options at configuration time for changing this behavior. The command ‘./configure --with-included-gettext’ bypasses system gettext to use the included GNU gettext instead, while ‘./configure --disable-nls’ produces programs totally unable to translate messages.
Internationalized packages usually have many ‘ll.po’ files. Unless translations are disabled, all those available are installed together with the package. However, the environment variable LINGUAS may be set, prior to configuration, to limit the installed set. LINGUAS should then contain a space-separated list of two-letter codes, stating which languages are allowed.
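The effect of LINGUAS can be pictured with a small Python sketch. It only illustrates the filtering rule stated above; the real selection happens in the package's configure machinery, and the function name and language list are hypothetical.

```python
import os

def languages_to_install(available, linguas=None):
    """Return the catalogs to install, honoring a LINGUAS-style restriction.

    available: language codes for which an 'll.po' file exists.
    linguas:   space-separated list of allowed codes, or None if unset.
    """
    if linguas is None:
        return list(available)      # no restriction: install everything
    allowed = set(linguas.split())
    return [code for code in available if code in allowed]

# Hypothetical set of translations shipped with a package.
available = ["de", "fr", "ja", "sv"]
print(languages_to_install(available, os.environ.get("LINGUAS")))
print(languages_to_install(available, "fr sv"))
```

Codes listed in LINGUAS for which the package ships no translation are simply ignored in this sketch.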
While the presentation of gettext focuses mostly on C and implicitly applies to C++ as well, its scope is far broader than that: many programming languages, scripting languages and other textual data like GUI resources or package descriptions can make use of the gettext approach.
All programming and scripting languages that have the notion of strings are eligible for supporting gettext. Supporting gettext means the following:
gettext would do, but a shorthand syntax helps keep internationalized programs legible. For example, in C we use the syntax _("string"), and in GNU awk we use the shorthand _"string".
gettext function, or performs equivalent processing.
ngettext, dcgettext, dcngettext available from within the language. These functions are less often used, but are nevertheless necessary for particular purposes: ngettext for correct plural handling, and dcgettext and dcngettext for obeying other locale-related environment variables than LC_MESSAGES, such as LC_TIME or LC_MONETARY. For these latter functions, you need to make the LC_* constants, available in the C header <locale.h>, referenceable from within the language, usually either as enumeration values or as strings.
textdomain function available from within the language, or by introducing a magic variable called TEXTDOMAIN. Similarly, you should allow the programmer to designate where to search for message catalogs, by providing access to the bindtextdomain function.
setlocale (LC_ALL, "") call during the startup of your language runtime, or allow the programmer to do so. Remember that gettext will act as a no-op if the LC_MESSAGES and LC_CTYPE locale categories are not both set.
xgettext program is being extended to support very different programming languages. Please contact the GNU gettext maintainers to help them do this. If the string extractor is best integrated into your language's parser, GNU xgettext can function as a front end to your string extractor.
gettext, but the programs should be portable across implementations, you should provide a no-i18n emulation, that makes the other implementations accept programs written for yours, without actually translating the strings.
gettext maintainers, so they can add support for your language to ‘po-mode.el’.
On the implementation side, three approaches are possible, with different effects on portability and copyright:
gettext's ‘intl/’ directory in your package, as described in ‘Maintainers’. This allows you to have internationalization on all kinds of platforms. Note that when you then distribute your package, it legally falls under the GNU General Public License, and the GNU project will be glad about your contribution to the Free Software pool.
gettext functions if they are found in the C library. For example, an autoconf test for gettext() and ngettext() will detect this situation. For the moment, this test will succeed on GNU systems and not on other platforms. No severe copyright restrictions apply.
gettext functionality. This has the advantage of full portability and no copyright restrictions, but also the drawback that you have to reimplement the GNU gettext features (such as the LANGUAGE environment variable, the locale aliases database, the automatic charset conversion, and plural handling).
For the programmer, the general procedure is the same as for the C language. The Emacs PO mode marking supports other languages, and the GNU xgettext string extractor recognizes other languages based on the file extension or a command-line option. In some languages, setlocale is not needed because it is already performed by the underlying language runtime.
The translator works exactly as in the C language case. The only difference is that when translating format strings, she has to be aware of the language's particular syntax for positional arguments in format strings.
C format strings are described in POSIX (IEEE P1003.1 2001), section XSH 3 fprintf(), http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html. See also the fprintf() manual page, http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php, http://informatik.fh-wuerzburg.de/student/i510/man/printf.html.
Although format strings with positions that reorder arguments, such as
"Only %2$d bytes free on '%1$s'."
which is semantically equivalent to
"'%s' has only %d bytes free."
are a POSIX/XSI feature and not specified by ISO C 99, translators can rely on this reordering ability: on the few platforms where printf(), fprintf() etc. don't support this feature natively, ‘libintl.a’ or ‘libintl.so’ provides replacement functions, and GNU <libintl.h> activates these replacement functions automatically.
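To make the reordering semantics concrete, here is a deliberately minimal Python emulation of numbered directives. It is an illustration only: the helper name is hypothetical, and it handles just `%N$s` and `%N$d`, not the full POSIX directive syntax (flags, widths, precisions).

```python
import re

def format_positional(fmt, *args):
    """Expand %N$s and %N$d directives (N counts from 1), nothing else."""
    def expand(match):
        index = int(match.group(1)) - 1   # POSIX positions are 1-based
        kind = match.group(2)
        value = args[index]
        return str(int(value)) if kind == "d" else str(value)
    return re.sub(r"%(\d+)\$([sd])", expand, fmt)

# The arguments keep the programmer's order; only the format string changes.
print(format_positional("Only %2$d bytes free on '%1$s'.", "/tmp", 512))
print("'%s' has only %d bytes free." % ("/tmp", 512))
```

Both calls receive the arguments in the same order, which is why a translator can reorder the sentence without the programmer changing the call.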
As a special feature for Farsi (Persian) and maybe Arabic, translators can insert an ‘I’ flag into numeric format directives. For example, the translation of "%d" can be "%Id". The effect of this flag, on systems with GNU libc, is that in the output, the ASCII digits are replaced with the ‘outdigits’ defined in the LC_CTYPE locale category. On other systems, the gettext function removes this flag, so that it has no effect.
Note that the programmer should not put this flag into the untranslated string. (Putting the ‘I’ format directive flag into an msgid string would lead to undefined behaviour on platforms without glibc when NLS is disabled.)
Objective C format strings are like C format strings. They support an additional format directive: "%@", which when executed consumes an argument of type Object *.
Shell format strings, as supported by GNU gettext and the ‘envsubst’ program, are strings with references to shell variables in the form $variable or ${variable}. References of the form ${variable-default}, ${variable:-default}, ${variable=default}, ${variable:=default}, ${variable+replacement}, ${variable:+replacement}, ${variable?ignored}, ${variable:?ignored}, that would be valid inside shell scripts, are not supported. The variable names must consist solely of alphanumeric or underscore ASCII characters, not start with a digit and be nonempty; otherwise such a variable reference is ignored.
Python format strings are described in Python Library reference / 2. Built-in Types, Exceptions and Functions / 2.2. Built-in Types / 2.2.6. Sequence Types / 2.2.6.2. String Formatting Operations. http://www.python.org/doc/2.2.1/lib/typesseq-strings.html.
Lisp format strings are described in the Common Lisp HyperSpec, chapter 22.3 Formatted Output, http://www.lisp.org/HyperSpec/Body/sec_22-3.html.
Emacs Lisp format strings are documented in the Emacs Lisp reference, section Formatting Strings, http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75. Note that as of version 21, XEmacs supports numbered argument specifications in format strings while FSF Emacs doesn't.
librep format strings are documented in the librep manual, section Formatted Output, http://librep.sourceforge.net/librep-manual.html#Formatted%20Output, http://www.gwinnup.org/research/docs/librep.html#SEC122.
Scheme format strings are documented in the SLIB manual, section Format Specification.
Smalltalk format strings are described in the GNU Smalltalk documentation, class CharArray, methods ‘bindWith:’ and ‘bindWithArguments:’. http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238.
In summary, a directive starts with ‘%’ and is followed by ‘%’ or a nonzero digit (‘1’ to ‘9’).
Java format strings are described in the JDK documentation for class java.text.MessageFormat, http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html. See also the ICU documentation http://oss.software.ibm.com/icu/apiref/classMessageFormat.html.
C# format strings are described in the .NET documentation for class System.String and in http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpConFormattingOverview.asp.
awk format strings are described in the gawk documentation, section Printf, http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf.
Where is this documented?
YCP sformat strings are described in the libycp documentation file:/usr/share/doc/packages/libycp/YCP-builtins.html. In summary, a directive starts with ‘%’ and is followed by ‘%’ or a nonzero digit (‘1’ to ‘9’).
Tcl format strings are described in the ‘format.n’ manual page, http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm.
There are two kinds of format strings in Perl: those acceptable to the Perl built-in function printf, labelled as ‘perl-format’, and those acceptable to the libintl-perl function __x, labelled as ‘perl-brace-format’.
Perl printf format strings are described in the sprintf section of ‘man perlfunc’.
Perl brace format strings are described in the ‘Locale::TextDomain(3pm)’ manual page of the CPAN package libintl-perl. In brief, Perl format uses placeholders put between braces (‘{’ and ‘}’). The placeholder must have the syntax of simple identifiers.
PHP format strings are described in the documentation of the PHP function sprintf, in ‘phpdoc/manual/function.sprintf.html’ or http://www.php.net/manual/en/function.sprintf.php.
These format strings are used inside the GCC sources. In such a format string, a directive starts with ‘%’, is optionally followed by a size specifier ‘l’, an optional flag ‘+’, another optional flag ‘#’, and is finished by a specifier: ‘%’ denotes a literal percent sign, ‘c’ denotes a character, ‘s’ denotes a string, ‘i’ and ‘d’ denote an integer, ‘o’, ‘u’, ‘x’ denote an unsigned integer, ‘.*s’ denotes a string preceded by a width specification, ‘H’ denotes a ‘location_t *’ pointer, ‘D’ denotes a general declaration, ‘F’ denotes a function declaration, ‘T’ denotes a type, ‘A’ denotes a function argument, ‘C’ denotes a tree code, ‘E’ denotes an expression, ‘L’ denotes a programming language, ‘O’ denotes a binary operator, ‘P’ denotes a function parameter, ‘Q’ denotes an assignment operator, ‘V’ denotes a const/volatile qualifier.
Qt format strings are described in the documentation of the QString class file:/usr/lib/qt-4.3.0/doc/html/qstring.html. In summary, a directive consists of a ‘%’ followed by a digit. The same directive cannot occur more than once in a format string.
KDE 4 format strings are defined as follows: A directive consists of a ‘%’ followed by a non-zero decimal number. If a ‘%n’ occurs in a format string, all of ‘%1’, ..., ‘%(n-1)’ must occur as well, except possibly one of them.
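The KDE 4 numbering rule lends itself to a small check. The Python sketch below is an illustration under the definition just given, not the validation code used by the gettext tools; the function name is hypothetical.

```python
import re

def kde_format_is_valid(fmt):
    """Check the KDE 4 numbering rule for '%' + non-zero decimal number."""
    numbers = {int(n) for n in re.findall(r"%([1-9][0-9]*)", fmt)}
    if not numbers:
        return True
    highest = max(numbers)
    missing = [k for k in range(1, highest) if k not in numbers]
    # At most one of %1 .. %(n-1) may be absent. Checking the highest
    # directive is enough, since lower ones miss a subset of the same numbers.
    return len(missing) <= 1

for fmt in ["%1 of %2 files", "%2 then %3", "only %3", "%1 and %4"]:
    print(fmt, kde_format_is_valid(fmt))
```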
Boost format strings are described in the documentation of the boost::format class, at http://www.boost.org/libs/format/doc/format.html. In summary, a directive has either the same syntax as in a C format string, such as ‘%1$+5d’, or may be surrounded by vertical bars, such as ‘%|1$+5d|’ or ‘%|1$+5|’, or consists of just an argument number between percent signs, such as ‘%1%’.
For the maintainer, the general procedure differs from the C language case in two ways.
gettextize program without the ‘--intl’ option, and that he invokes the AM_GNU_GETTEXT autoconf macro via ‘AM_GNU_GETTEXT([external])’.
XGETTEXT_OPTIONS variable in ‘po/Makevars’ (see the section ‘Makevars’ in ‘po/’) should be adjusted to match the xgettext options for that particular programming language. If the package uses more than one programming language with gettext support, it becomes necessary to change the POT file construction rule in ‘po/Makefile.in.in’. It is recommended to make one xgettext invocation per programming language, each with the options appropriate for that language, and to combine the resulting files using msgcat.
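The recommended rule (one xgettext invocation per language, then msgcat) might be generated along these lines. This Python sketch only builds the command lines and does not run them; the file naming scheme and source paths are hypothetical, and language-specific options such as keyword flags are left out for brevity.

```python
def pot_commands(domain, sources_by_language):
    """Build one xgettext command per language plus a final msgcat command.

    sources_by_language maps an xgettext language name to its source files.
    The commands are returned as argument lists, not executed.
    """
    commands, partial_files = [], []
    for language, sources in sorted(sources_by_language.items()):
        partial = f"{domain}-{language.lower()}.pot"   # hypothetical naming scheme
        commands.append(["xgettext", f"--language={language}", "-o", partial, *sources])
        partial_files.append(partial)
    # msgcat concatenates and merges the per-language POT files.
    commands.append(["msgcat", "-o", f"{domain}.pot", *partial_files])
    return commands

for command in pot_commands("hello", {"C": ["src/main.c"], "Python": ["tools/report.py"]}):
    print(" ".join(command))
```

In an actual package this logic would live in the POT file rule of ‘po/Makefile.in.in’ rather than in a script.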
RPMs: gcc, gpp, gobjc, glibc, gettext
File extension: For C: c, h. For C++: C, c++, cc, cxx, cpp, hpp. For Objective C: m.
String syntax: "abc"
gettext shorthand: _("abc")
gettext/ngettext functions: gettext, dgettext, dcgettext, ngettext, dngettext, dcngettext
textdomain: textdomain function
bindtextdomain: bindtextdomain function
setlocale: Programmer must call setlocale (LC_ALL, "")
Prerequisite: #include <libintl.h>; #include <locale.h>; #define _(string) gettext (string)
Use or emulate GNU gettext: Use
Extractor: xgettext -k_
Formatting with positions: fprintf "%2$d %1$d". In C++: autosprintf "%2$d %1$d" (see the ‘Introduction’ section of the GNU autosprintf manual)
Portability: autoconf (gettext.m4) and #if ENABLE_NLS
po-mode marking: yes
The following examples are available in the ‘examples’ directory: hello-c, hello-c-gnome, hello-c++, hello-c++-qt, hello-c++-kde, hello-c++-gnome, hello-c++-wxwidgets, hello-objc, hello-objc-gnustep, hello-objc-gnome.
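RPMs: bash, gettext
File extension: sh
String syntax: "abc", 'abc', abc
gettext shorthand: "`gettext \"abc\"`"
gettext/ngettext functions: gettext, ngettext programs; eval_gettext, eval_ngettext shell functions
textdomain: environment variable TEXTDOMAIN
bindtextdomain: environment variable TEXTDOMAINDIR
setlocale: automatic
Prerequisite: . gettext.sh
Use or emulate GNU gettext: use
Extractor: xgettext
Formatting with positions: —
Portability: fully portable
po-mode marking: —
An example is available in the ‘examples’ directory: hello-sh.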
Preparing a shell script for internationalization is conceptually similar to the steps described in ‘Sources’. The concrete steps for shell scripts are as follows.
Insert the line
. gettext.sh
near the top of the script. gettext.sh is a shell function library that provides the functions eval_gettext (see ‘eval_gettext Invocation’) and eval_ngettext (see ‘eval_ngettext Invocation’). You have to ensure that gettext.sh can be found in the PATH.
Set and export the TEXTDOMAIN and TEXTDOMAINDIR environment variables. Usually TEXTDOMAIN is the package or program name, and TEXTDOMAINDIR is the absolute pathname corresponding to $prefix/share/locale, where $prefix is the installation location.
TEXTDOMAIN=@PACKAGE@
export TEXTDOMAIN
TEXTDOMAINDIR=@LOCALEDIR@
export TEXTDOMAINDIR
Simplify translatable strings so that they do not contain command substitution ("`...`" or "$(...)"), variable access with defaulting (like ${variable-default}), access to positional arguments (like $0, $1, ...) or highly volatile shell variables (like $?). This can always be done through simple local code restructuring. For example,
echo "Usage: $0 [OPTION] FILE..."
becomes
program_name=$0
echo "Usage: $program_name [OPTION] FILE..."
Similarly,
echo "Remaining files: `ls | wc -l`"
becomes
filecount="`ls | wc -l`"
echo "Remaining files: $filecount"
When doing this, you also need to add an extra backslash before the dollar sign in references to shell variables, so that the ‘eval_gettext’ function receives the translatable string before the variable values are substituted into it. For example,
echo "Remaining files: $filecount"
becomes
eval_gettext "Remaining files: \$filecount"; echo
If the output command is not ‘echo’, you can make it use ‘echo’ nevertheless, through the use of backquotes. However, note that inside backquotes, backslashes must be doubled to be effective (because the backquoting eats one level of backslashes). For example, assuming that ‘error’ is a shell function that signals an error,
error "file not found: $filename"
is first transformed into
error "`echo \"file not found: \$filename\"`"
which then becomes
error "`eval_gettext \"file not found: \\\$filename\"`"
gettext.sh
gettext.sh, contained in the run-time package of GNU gettext, provides the following:
echo is set to a command that outputs its first argument and a newline, without interpreting backslashes in the argument string.
gettext
gettext [option] [[textdomain] msgid]
gettext [option] -s [msgid]...
The gettext program displays the native language translation of a textual message.
Arguments
‘-d textdomain’, ‘--domain=textdomain’
Retrieve translated messages from textdomain. Usually a textdomain corresponds to a package, a program, or a module of a program.
‘-e’
Enable expansion of some escape sequences. This option is for compatibility with the ‘echo’ program or shell built-in. The escape sequences ‘\a’, ‘\b’, ‘\c’, ‘\f’, ‘\n’, ‘\r’, ‘\t’, ‘\v’, ‘\\’, and ‘\’ followed by one to three octal digits, are interpreted like the System V ‘echo’ program did.
‘-E’
This option is only for compatibility with the ‘echo’ program or shell built-in. It has no effect.
‘-h’, ‘--help’
Display this help and exit.
‘-n’
Suppress trailing newline. By default, gettext adds a newline to the output.
‘-V’, ‘--version’
Output version information and exit.
‘[textdomain] msgid’
Retrieve translated message corresponding to msgid from textdomain.
If the textdomain parameter is not given, the domain is determined from the environment variable TEXTDOMAIN. If the message catalog is not found in the regular directory, another location can be specified with the environment variable TEXTDOMAINDIR.
When used with the -s option the program behaves like the ‘echo’ command. But it does not simply copy its arguments to stdout. Instead those messages found in the selected catalog are translated.
ngettext
ngettext [option] [textdomain] msgid msgid-plural count
The ngettext program displays the native language translation of a textual message whose grammatical form depends on a number.
Arguments
‘-d textdomain’, ‘--domain=textdomain’
Retrieve translated messages from textdomain. Usually a textdomain corresponds to a package, a program, or a module of a program.
‘-e’
Enable expansion of some escape sequences. This option is for compatibility with the ‘gettext’ program. The escape sequences ‘\a’, ‘\b’, ‘\c’, ‘\f’, ‘\n’, ‘\r’, ‘\t’, ‘\v’, ‘\\’, and ‘\’ followed by one to three octal digits, are interpreted like the System V ‘echo’ program did.
‘-E’
This option is only for compatibility with the ‘gettext’ program. It has no effect.
‘-h’, ‘--help’
Display this help and exit.
‘-V’, ‘--version’
Output version information and exit.
‘textdomain’
Retrieve translated message from textdomain.
‘msgid msgid-plural’
Translate msgid (English singular) / msgid-plural (English plural).
‘count’
Choose singular/plural form based on this value.
If the textdomain parameter is not given, the domain is determined from the environment variable TEXTDOMAIN. If the message catalog is not found in the regular directory, another location can be specified with the environment variable TEXTDOMAINDIR.
envsubst
envsubst [option] [shell-format]
The envsubst program substitutes the values of environment variables.
Operation mode
‘-v’, ‘--variables’
Output the variables occurring in shell-format.
Informative output
‘-h’, ‘--help’
Display this help and exit.
‘-V’, ‘--version’
Output version information and exit.
In normal operation mode, standard input is copied to standard output, with references to environment variables of the form $VARIABLE or ${VARIABLE} being replaced with the corresponding values. If a shell-format is given, only those environment variables that are referenced in shell-format are substituted; otherwise all environment variable references occurring in standard input are substituted.
These substitutions are a subset of the substitutions that a shell performs on unquoted and double-quoted strings. Other kinds of substitutions done by a shell, such as ${variable-default} or $(command-list) or `command-list`, are not performed by the envsubst program, due to security reasons.
When --variables is used, standard input is ignored, and the output consists of the environment variables that are referenced in shell-format, one per line.
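The behavior described for envsubst can be approximated in Python for illustration. This is a sketch of the documented rules, not the program's actual implementation; the function names are hypothetical, and expanding an unset variable to the empty string is an assumption of this sketch.

```python
import re

# $NAME or ${NAME}: ASCII letters, digits, underscore; must not start with a digit.
REFERENCE = re.compile(r"\$(?:([A-Za-z_][A-Za-z0-9_]*)|\{([A-Za-z_][A-Za-z0-9_]*)\})")

def referenced_variables(shell_format):
    """Names referenced in shell_format, in order of first appearance."""
    names = []
    for plain, braced in REFERENCE.findall(shell_format):
        name = plain or braced
        if name not in names:
            names.append(name)
    return names

def substitute(text, environment, shell_format=None):
    """Replace references with values; restrict to shell_format's names if given."""
    allowed = None if shell_format is None else set(referenced_variables(shell_format))
    def replace(match):
        name = match.group(1) or match.group(2)
        if allowed is not None and name not in allowed:
            return match.group(0)            # leave the reference untouched
        return environment.get(name, "")     # assumption: unset expands to ""
    return REFERENCE.sub(replace, text)

env = {"USER": "alice", "HOME": "/home/alice"}
print(substitute("Hello $USER, home is ${HOME}", env))
print(substitute("Hello $USER, home is ${HOME}", env, "$USER"))
print(referenced_variables("$HOME/${USER}"))
```

Forms such as ${variable-default} or $(command-list) do not match the pattern and therefore pass through unchanged, which mirrors the restriction described above.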
eval_gettext
eval_gettext msgid
This function outputs the native language translation of a textual message, performing dollar-substitution on the result. Note that only shell variables mentioned in msgid will be dollar-substituted in the result.
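The order of operations (translate first, then substitute only the variables that the msgid mentions) can be illustrated in Python. This sketch is an analogy, not the shell function itself: the Python function name reuses eval_gettext for readability, the variable dictionary stands in for the shell environment, and the translate parameter is a hypothetical hook that defaults to Python's own gettext.gettext.

```python
import gettext
import re
from string import Template

def eval_gettext(msgid, variables, translate=gettext.gettext):
    """Translate msgid, then substitute only the variables msgid mentions."""
    mentioned = set(re.findall(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?", msgid))
    translated = translate(msgid)
    visible = {name: value for name, value in variables.items() if name in mentioned}
    # safe_substitute leaves references it has no value for untouched.
    return Template(translated).safe_substitute(visible)

variables = {"filecount": "3", "secret": "hunter2"}
print(eval_gettext("Remaining files: $filecount", variables))

# A translation that mentions a variable absent from the msgid does not
# get that variable expanded.
fake = lambda msgid: "Fichiers restants : $filecount ($secret)"
print(eval_gettext("Remaining files: $filecount", variables, translate=fake))
```

Restricting substitution to the variables named in the msgid means a translation cannot pull in values the programmer never intended to expose.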
eval_ngettext
eval_ngettext msgid msgid-plural count
This function outputs the native language translation of a textual message whose grammatical form depends on a number, performing dollar-substitution on the result. Note that only shell variables mentioned in msgid or msgid-plural will be dollar-substituted in the result.
GNU bash 2.0 or newer has a special shorthand for translating a string and substituting variable values in it: $"msgid". But the use of this construct is discouraged, due to the security holes it opens and due to its portability problems.
The security holes of $"..." come from the fact that after looking up the translation of the string, bash processes it like it processes any double-quoted string: dollar and backquote processing, like ‘eval’ does.
0x60. For example, the byte sequence \xe0\x60 is a single character in these locales. Many versions of bash (all versions up to bash-2.05, and newer versions on platforms without mbsrtowcs() function) don't know about character boundaries and see a backquote character where there is only a particular Chinese character. Thus it can start executing part of the translation as a command list. This situation can occur even without the translator being aware of it: if the translator provides translations in the UTF-8 encoding, it is the gettext() function which will, during its conversion from the translator's encoding to the user's locale's encoding, produce the dangerous \x60 bytes.
"`...`" or dollar-parentheses "$(...)" in her translations. The enclosed strings would be executed as command lists by the shell.
The portability problem is that bash must be built with internationalization support; this is normally not the case on systems that don't have the gettext() function in libc.
RPMs: python
File extension: py
String syntax: 'abc', u'abc', r'abc', ur'abc', "abc", u"abc", r"abc", ur"abc", '''abc''', u'''abc''', r'''abc''', ur'''abc''', """abc""", u"""abc""", r"""abc""", ur"""abc"""
gettext shorthand: _('abc') etc.
gettext/ngettext functions: gettext.gettext, gettext.dgettext, gettext.ngettext, gettext.dngettext, also ugettext, ungettext
textdomain: gettext.textdomain function, or gettext.install(domain) function
bindtextdomain: gettext.bindtextdomain function, or gettext.install(domain,localedir) function
setlocale: not used by the gettext emulation
Prerequisite: import gettext
Use or emulate GNU gettext: emulate
Extractor: xgettext
Formatting with positions: '...%(ident)d...' % { 'ident': value }
Portability: fully portable
po-mode marking: —
An example is available in the ‘examples’ directory: hello-python.
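RPMs: clisp 2.28 or newer
File extension: lisp
String syntax: "abc"
gettext shorthand: (_ "abc"), (ENGLISH "abc")
gettext/ngettext functions: i18n:gettext, i18n:ngettext
textdomain: i18n:textdomain
bindtextdomain: i18n:textdomaindir
setlocale: automatic
Prerequisite: —
Use or emulate GNU gettext: use
Extractor: xgettext -k_ -kENGLISH
Formatting with positions: format "~1@*~D ~0@*~D"
Portability: On platforms without gettext, no translation.
po-mode marking: —
An example is available in the ‘examples’ directory: hello-clisp.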
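RPMs: clisp
File extension: d
String syntax: "abc"
gettext shorthand: ENGLISH ? "abc" : "", GETTEXT("abc"), GETTEXTL("abc")
gettext/ngettext functions: clgettext, clgettextl
textdomain: —
bindtextdomain: —
setlocale: automatic
Prerequisite: #include "lispbibl.c"
Use or emulate GNU gettext: use
Extractor: clisp-xgettext
Formatting with positions: fprintf "%2$d %1$d"
Portability: On platforms without gettext, no translation.
po-mode marking: —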
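RPMs: emacs, xemacs
File extension: el
String syntax: "abc"
gettext shorthand: (_"abc")
gettext/ngettext functions: gettext, dgettext (xemacs only)
textdomain: domain special form (xemacs only)
bindtextdomain: bind-text-domain function (xemacs only)
setlocale: automatic
Prerequisite: —
Use or emulate GNU gettext: use
Extractor: xgettext
Formatting with positions: format "%2$d %1$d"
Portability: Only XEmacs. Without I18N3 defined at build time, no translation.
po-mode marking: —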
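RPMs: librep 0.15.3 or newer
File extension: jl
String syntax: "abc"
gettext shorthand: (_"abc")
gettext/ngettext functions: gettext
textdomain: textdomain function
bindtextdomain: bindtextdomain function
setlocale: —
Prerequisite: (require 'rep.i18n.gettext)
Use or emulate GNU gettext: use
Extractor: xgettext
Formatting with positions: format "%2$d %1$d"
Portability: On platforms without gettext, no translation.
po-mode marking: —
An example is available in the ‘examples’ directory: hello-librep.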
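RPMs: guile
File extension: scm
String syntax: "abc"
gettext shorthand: (_ "abc")
gettext/ngettext functions: gettext, ngettext
textdomain: textdomain
bindtextdomain: bindtextdomain
setlocale: (catch #t (lambda () (setlocale LC_ALL "")) (lambda args #f))
Prerequisite: (use-modules (ice-9 format))
Use or emulate GNU gettext: use
Extractor: xgettext -k_
Formatting with positions: —
Portability: On platforms without gettext, no translation.
po-mode marking: —
An example is available in the ‘examples’ directory: hello-guile.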
RPMs: smalltalk
File extension: st
String syntax: 'abc'
gettext shorthand: NLS ? 'abc'
gettext/ngettext functions: LcMessagesDomain>>#at:, LcMessagesDomain>>#at:plural:with:
textdomain: LcMessages>>#domain:localeDirectory: (returns a LcMessagesDomain object). Example: I18N Locale default messages domain: 'gettext' localeDirectory: '/usr/local/share/locale'
bindtextdomain: LcMessages>>#domain:localeDirectory:, see above.
setlocale: Automatic if you use I18N Locale default.
Prerequisite: PackageLoader fileInPackage: 'I18N'!
Use or emulate GNU gettext: emulate
Extractor: xgettext
Formatting with positions: '%1 %2' bindWith: 'Hello' with: 'world'
Portability: fully portable
po-mode marking: —
An example is available in the ‘examples’ directory: hello-smalltalk.
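RPMs: java, java2
File extension: java
String syntax: "abc"
gettext shorthand: _("abc")
gettext/ngettext functions: GettextResource.gettext, GettextResource.ngettext, GettextResource.pgettext, GettextResource.npgettext
textdomain: —, use ResourceBundle.getResource instead
bindtextdomain: —, use CLASSPATH instead
setlocale: automatic
Prerequisite: —
Use or emulate GNU gettext: —, uses a Java specific message catalog format
Extractor: xgettext -k_
Formatting with positions: MessageFormat.format "{1,number} {0,number}"
Portability: fully portable
po-mode marking: —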
Before marking strings as internationalizable, uses of the string concatenation operator need to be converted to MessageFormat applications. For example, "file "+filename+" not found" becomes MessageFormat.format("file {0} not found", new Object[] { filename }). Only after this is done, can the strings be marked and extracted.
GNU gettext uses the native Java internationalization mechanism, namely ResourceBundles. There are two formats of ResourceBundles: .properties files and .class files. The .properties format is a text file which the translators can directly edit, like PO files, but which doesn't support plural forms. Whereas the .class format is compiled from .java source code and can support plural forms (provided it is accessed through an appropriate API, see below).
To convert a PO file to a .properties file, the msgcat program can be used with the option --properties-output. To convert a .properties file back to a PO file, the msgcat program can be used with the option --properties-input. All the tools that manipulate PO files can work with .properties files as well, if given the --properties-input and/or --properties-output option.
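A .properties catalog is an ordinary Java properties file, so the standard PropertyResourceBundle class can read it. The sketch below parses a single hand-written entry from a string instead of a file. The entry itself, the class name PropertiesDemo and the helper lookup are assumptions for illustration, not output copied from msgcat.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class PropertiesDemo {
    // One hand-written entry in .properties syntax. The space inside the key
    // (the msgid) is escaped with a backslash, as the file format requires.
    static final String CATALOG =
        "Operation\\ completed.=Vorgang abgeschlossen.\n";

    // Hypothetical helper: parses the catalog text and looks up one msgid.
    static String lookup(String msgid) {
        try {
            ResourceBundle bundle =
                new PropertyResourceBundle(new StringReader(CATALOG));
            return bundle.getString(msgid);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("Operation completed."));
    }
}
```

In a real project the bundle would be found on the CLASSPATH through ResourceBundle.getBundle rather than parsed from a string.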
To convert a PO file to a ResourceBundle class, the msgfmt program can be used with the option --java or --java2. To convert a ResourceBundle back to a PO file, the msgunfmt program can be used with the option --java.
Two different programmatic APIs can be used to access ResourceBundles. Note that both APIs work with all kinds of ResourceBundles, whether GNU gettext generated classes, or other .class or .properties files.
1. The java.util.ResourceBundle API. In particular, its getString function returns a string translation. Note that a missing translation yields a MissingResourceException. This has the advantage of being the standard API. It also does not require any additional libraries, only the msgcat generated .properties files or the msgfmt generated .class files. But it cannot do plural handling, even if the resource was generated by msgfmt from a PO file with plural handling.
2. The gnu.gettext.GettextResource API. Reference documentation in Javadoc 1.1 style format is in the javadoc2 directory. Its gettext function returns a string translation. Note that when a translation is missing, the msgid argument is returned unchanged. This has the advantage of having the ngettext function for plural handling and the pgettext and npgettext functions for strings constrained to a particular context. To use this API, one needs the libintl.jar file which is part of the GNU gettext package and distributed under the LGPL.
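The two APIs differ in what happens when a translation is missing. A project that stays with the standard API but prefers the msgid fallback can get it with a thin wrapper. The following is a minimal sketch, not part of GNU gettext: the class FallbackDemo, the helper tr and the in-memory Messages bundle (a stand-in for a real catalog loaded with ResourceBundle.getBundle) are assumptions for illustration.

```java
import java.util.ListResourceBundle;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

class FallbackDemo {
    // Stand-in for a generated catalog with a single translated message.
    static class Messages extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                { "Operation completed.", "Vorgang abgeschlossen." },
            };
        }
    }

    // Hypothetical shorthand: returns the translation, or the msgid itself
    // when the standard API reports a missing resource.
    static String tr(ResourceBundle bundle, String msgid) {
        try {
            return bundle.getString(msgid);
        } catch (MissingResourceException e) {
            return msgid;
        }
    }

    public static void main(String[] args) {
        ResourceBundle bundle = new Messages();
        System.out.println(tr(bundle, "Operation completed."));
        System.out.println(tr(bundle, "Not in the catalog."));
    }
}
```

Plural and context lookups are still out of reach this way; for those the GettextResource API remains the option described above.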
Four examples, using the second API, are available in the ‘examples’ directory: hello-java, hello-java-awt, hello-java-swing, hello-java-qtjambi.
Now, to make use of the API and define a shorthand for ‘getString’, there are three idioms that you can choose from:
1. In one class of your project, here called Util, define a static variable holding the ResourceBundle instance and the shorthand:
    private static ResourceBundle myResources =
      ResourceBundle.getBundle("domain-name");
    public static String _(String s) {
      return myResources.getString(s);
    }
All classes containing internationalized strings then contain
    import static Util._;
and the shorthand is used like this:
    System.out.println(_("Operation completed."));
2. In one class of your project, here called Util, define a static variable holding the ResourceBundle instance:
    public static ResourceBundle myResources =
      ResourceBundle.getBundle("domain-name");
All classes containing internationalized strings then contain
    private static ResourceBundle res = Util.myResources;
    private static String _(String s) { return res.getString(s); }
and the shorthand is used like this:
    System.out.println(_("Operation completed."));
3. Add a class with a very short name, here called S, containing just the resource bundle and the shorthand:
    public class S {
      public static ResourceBundle myResources =
        ResourceBundle.getBundle("domain-name");
      public static String _(String s) {
        return myResources.getString(s);
      }
    }
and the shorthand is used like this:
    System.out.println(S._("Operation completed."));
Which of the three idioms you choose will depend on whether your project requires portability to Java versions prior to Java 1.5 (the static import in the first idiom needs Java 1.5 or newer) and, if so, whether copying two lines of code into every class is more acceptable in your project than a class with a single-letter name. Note that a lone underscore is no longer accepted as an identifier as of Java 9, so with current compilers the shorthand needs a different name.
RPMs: pnet, pnetlib 0.6.2 or newer, or mono 0.29 or newer
File extension: cs
String syntax: "abc", @"abc"
gettext shorthand: _("abc")
gettext/ngettext functions: GettextResourceManager.GetString, GettextResourceManager.GetPluralString, GettextResourceManager.GetParticularString, GettextResourceManager.GetParticularPluralString
textdomain: new GettextResourceManager(domain)
bindtextdomain: —, compiled message catalogs are located in subdirectories of the directory containing the executable
setlocale: automatic
Prerequisite: —
Use or emulate GNU gettext: —, uses a C# specific message catalog format
Extractor: xgettext -k_
Formatting with positions: String.Format "{1} {0}"
Portability: fully portable
po-mode marking: —
Before marking strings as internationalizable, uses of the string concatenation operator need to be converted to String.Format invocations. For example, "file "+filename+" not found" becomes String.Format("file {0} not found", filename). Only after this is done can the strings be marked and extracted.
GNU gettext uses the native C#/.NET internationalization mechanism, namely the classes ResourceManager and ResourceSet. Applications use the ResourceManager methods to retrieve the native language translation of strings. An instance of ResourceSet is the in-memory representation of a message catalog file. The ResourceManager loads and accesses ResourceSet instances as needed to look up the translations.
There are two formats of ResourceSets that can be directly loaded by the C# runtime: .resources files and .dll files.
1. The .resources format is a binary file usually generated through the resgen or monoresgen utility, but which doesn't support plural forms. .resources files can also be embedded in .NET .exe files. This only affects whether a file system access is performed to load the message catalog; it doesn't affect the contents of the message catalog.
2. The .dll format is a binary file that is compiled from .cs source code and can support plural forms (provided it is accessed through the GNU gettext API, see below).
Note that these .NET .dll and .exe files are not tied to a particular platform; their file format and GNU gettext for C# can be used on any platform.
To convert a PO file to a .resources file, the msgfmt program can be used with the option ‘--csharp-resources’. To convert a .resources file back to a PO file, the msgunfmt program can be used with the option ‘--csharp-resources’. You can also, in some cases, use the resgen program (from the pnet package) or the monoresgen program (from the mono/mcs package). These programs can also convert a .resources file back to a PO file. But beware: as of this writing (January 2004), the monoresgen converter is quite buggy and the resgen converter ignores the encoding of the PO files.
To convert a PO file to a .dll file, the msgfmt program can be used with the option --csharp. The result will be a .dll file containing a subclass of GettextResourceSet, which itself is a subclass of ResourceSet. To convert a .dll file containing a GettextResourceSet subclass back to a PO file, the msgunfmt program can be used with the option --csharp.
The advantages of the .dll format over the .resources format are:
1. Translations can be added after the application has been built. When the programmer uses a ResourceManager constructor provided by the system, by contrast, the set of .resources files for an application must be specified when the application is built and cannot be extended afterwards.
2. A message catalog in .dll format supports the plural handling function GetPluralString, whereas .resources files can only contain data and only support lookups that depend on a single string.
3. A message catalog in .dll format supports the query-with-context functions GetParticularString and GetParticularPluralString, whereas .resources files can only contain data and only support lookups that depend on a single string.
4. The GettextResourceManager that loads the message catalogs in .dll format also provides for inheritance on a per-message basis. For example, in Austrian (de_AT) locale, translations from the German (de) message catalog will be used for messages not found in the Austrian message catalog. This has the consequence that the Austrian translators need only translate those few messages for which the translation into Austrian differs from the German one. When working with .resources files, by contrast, each message catalog must provide the translations of all messages by itself.
5. The GettextResourceManager that loads the message catalogs in .dll format also provides for a fallback: the English msgid is returned when no translation can be found. When working with .resources files, by contrast, a language-neutral .resources file must explicitly be provided as a fallback.
On the side of the programmatic APIs, the programmer can use either the standard ResourceManager API or the GNU GettextResourceManager API. The latter is an extension of the former, because GettextResourceManager is a subclass of ResourceManager.
1. The System.Resources.ResourceManager API. This API works with resources in .resources format. The creation of the ResourceManager is done through
    new ResourceManager(domainname, Assembly.GetExecutingAssembly())
The GetString function returns a string's translation. Note that this function returns null when a translation is missing (i.e. not even found in the fallback resource file).
2. The GNU.Gettext.GettextResourceManager API. This API works with resources in .dll format. Reference documentation is in the csharpdoc directory. The creation of the ResourceManager is done through
    new GettextResourceManager(domainname)
The GetString function returns a string's translation. Note that when a translation is missing, the msgid argument is returned unchanged.
The GetPluralString function returns a string translation with plural handling, like the ngettext function in C.
The GetParticularString function returns a string's translation, specific to a particular context, like the pgettext function in C. Note that when a translation is missing, the msgid argument is returned unchanged.
The GetParticularPluralString function returns a string translation, specific to a particular context, with plural handling, like the npgettext function in C.
To use this API, one needs the GNU.Gettext.dll file which is part of the GNU gettext package and distributed under the LGPL.
You can also mix both approaches: use the GNU.Gettext.GettextResourceManager constructor, but otherwise use only the ResourceManager type and only the GetString method. This is appropriate when you want to profit from the tools for PO files, but don't want to change existing source code that uses ResourceManager and don't (yet) need the GetPluralString method.
Two examples, using the second API, are available in the ‘examples’ directory: hello-csharp, hello-csharp-forms.
Now, to make use of the API and define a shorthand for ‘GetString’, there are two idioms that you can choose from:
1. In one class of your project, here called Util, define a static variable holding the ResourceManager instance:
    public static GettextResourceManager MyResourceManager =
      new GettextResourceManager("domain-name");
All classes containing internationalized strings then contain
    private static GettextResourceManager Res = Util.MyResourceManager;
    private static String _(String s) { return Res.GetString(s); }
and the shorthand is used like this:
    Console.WriteLine(_("Operation completed."));
2. Add a class with a very short name, here called S, containing just the resource manager and the shorthand:
    public class S {
      public static GettextResourceManager MyResourceManager =
        new GettextResourceManager("domain-name");
      public static String _(String s) {
        return MyResourceManager.GetString(s);
      }
    }
and the shorthand is used like this:
    Console.WriteLine(S._("Operation completed."));
Which of the two idioms you choose will depend on whether copying two lines of code into every class is more acceptable in your project than a class with a single-letter name.
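RPMs: gawk 3.1 or newer
File extension: awk
String syntax: "abc"
gettext shorthand: _"abc"
gettext/ngettext functions: dcgettext, missing dcngettext in gawk-3.1.0
textdomain: TEXTDOMAIN variable
bindtextdomain: bindtextdomain function
setlocale: automatic, but missing setlocale (LC_MESSAGES, "") in gawk-3.1.0
Prerequisite: —
Use or emulate GNU gettext: use
Extractor: xgettext
Formatting with positions: printf "%2$d %1$d" (GNU awk only)
Portability: On platforms without gettext, no translation. On non-GNU awks, you must define dcgettext, dcngettext and bindtextdomain yourself.
po-mode marking: —
An example is available in the ‘examples’ directory: hello-gawk.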
RPMs: fpk
File extension: pp, pas
String syntax: 'abc'
gettext shorthand: automatic
gettext/ngettext functions: —, use ResourceString data type instead
textdomain: —, use TranslateResourceStrings function instead
bindtextdomain: —, use TranslateResourceStrings function instead
setlocale: automatic, but uses only LANG, not LC_MESSAGES or LC_ALL
Prerequisite: {$mode delphi} or {$mode objfpc}
    uses gettext;
Use or emulate GNU gettext: emulate partially
Extractor: ppc386 followed by xgettext or rstconv
Formatting with positions: uses sysutils;
    format "%1:d %0:d"
Portability: ?
po-mode marking: —
The Pascal compiler has special support for the ResourceString data type. It generates a .rst file. This is then converted to a .pot file by use of xgettext or rstconv. At runtime, a .mo file corresponding to translations of this .pot file can be loaded using the TranslateResourceStrings function in the gettext unit.
An example is available in the ‘examples’ directory: hello-pascal.
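RPMs: wxGTK, gettext
File extension: cpp
String syntax: "abc"
gettext shorthand: _("abc")
gettext/ngettext functions: wxLocale::GetString, wxGetTranslation
textdomain: wxLocale::AddCatalog
bindtextdomain: wxLocale::AddCatalogLookupPathPrefix
setlocale: wxLocale::Init, wxSetLocale
Prerequisite: #include <wx/intl.h>
Use or emulate GNU gettext: emulate, see include/wx/intl.h and src/common/intl.cpp
Extractor: xgettext
Formatting with positions: wxString::Format supports positions if and only if the system has wprintf(), vswprintf() functions and they support positions according to POSIX.
Portability: fully portable
po-mode marking: yes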
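RPMs: libycp, libycp-devel, yast2-core, yast2-core-devel
File extension: ycp
String syntax: "abc"
gettext shorthand: _("abc")
gettext/ngettext functions: _() with 1 or 3 arguments
textdomain: textdomain statement
bindtextdomain: —
setlocale: —
Prerequisite: —
Use or emulate GNU gettext: use
Extractor: xgettext
Formatting with positions: sformat "%2 %1"
Portability: fully portable
po-mode marking: —
An example is available in the ‘examples’ directory: hello-ycp.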
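RPMs: tcl
File extension: tcl
String syntax: "abc"
gettext shorthand: [_ "abc"]
gettext/ngettext functions: ::msgcat::mc
textdomain: —
bindtextdomain: —, use ::msgcat::mcload instead
setlocale: automatic, uses LANG, but ignores LC_MESSAGES and LC_ALL
Prerequisite: package require msgcat
    proc _ {s} {return [::msgcat::mc $s]}
Use or emulate GNU gettext: —, uses a Tcl specific message catalog format
Extractor: xgettext -k_
Formatting with positions: format "%2\$d %1\$d"
Portability: fully portable
po-mode marking: —
Two examples are available in the ‘examples’ directory: hello-tcl, hello-tcl-tk.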
Before marking strings as internationalizable, substitutions of variables into the string need to be converted to format invocations. For example, "file $filename not found" becomes [format "file %s not found" $filename]. Only after this is done can the strings be marked and extracted. After marking, this example becomes [format [_ "file %s not found"] $filename] or [msgcat::mc "file %s not found" $filename]. Note that the msgcat::mc function implicitly calls format when more than one argument is given.
RPMs: perl
File extension: pl, PL, pm, cgi
String syntax: "abc", 'abc', qq (abc), q (abc), qr /abc/, qx (/bin/date), /pattern match/, ?pattern match?, s/substitution/operators/, $tied_hash{"message"}, $tied_hash_reference->{"message"}
gettext shorthand: __ (double underscore)
gettext/ngettext functions: gettext, dgettext, dcgettext, ngettext, dngettext, dcngettext
textdomain: textdomain function
bindtextdomain: bindtextdomain function
bind_textdomain_codeset: bind_textdomain_codeset function
setlocale: Use setlocale (LC_ALL, "");
Prerequisite: use POSIX;
    use Locale::TextDomain; (included in the package libintl-perl which is available on the Comprehensive Perl Archive Network CPAN, http://www.cpan.org/).
Use or emulate GNU gettext: platform dependent: gettext_pp emulates, gettext_xs uses GNU gettext
Extractor: xgettext -k__ -k\$__ -k%__ -k__x -k__n:1,2 -k__nx:1,2 -k__xn:1,2 -kN__ -k
Formatting with positions: Both kinds of format strings support formatting with positions.
    printf "%2\$d %1\$d", ... (requires Perl 5.8.0 or newer)
    __expand("[new] replaces [old]", old => $oldvalue, new => $newvalue)
Portability: The libintl-perl package is platform independent but is not part of the Perl core. The programmer is responsible for providing a dummy implementation of the required functions if the package is not installed on the target system.
po-mode marking: —
Documentation: Included in libintl-perl, available on CPAN (http://www.cpan.org/).
An example is available in the ‘examples’ directory: hello-perl.
The xgettext parser backend for Perl differs significantly from the parser backends for other programming languages, just as Perl itself differs significantly from other programming languages. The Perl parser backend offers many more string marking facilities than the other backends, but it also has some Perl specific limitations, the worst probably being that it is imperfect.
15.5.18.1 General Problems Parsing Perl Code
15.5.18.2 Which keywords will xgettext look for?
15.5.18.3 How to Extract Hash Keys
15.5.18.4 What are Strings And Quote-like Expressions?
15.5.18.5 Invalid Uses Of String Interpolation
15.5.18.6 Valid Uses Of String Interpolation
15.5.18.7 When To Use Parentheses
15.5.18.8 How To Grok with Long Lines
15.5.18.9 Bugs, Pitfalls, And Things That Do Not Work
15.5.18.1 General Problems Parsing Perl Code
It is often heard that only Perl can parse Perl. This is not true. Perl cannot be parsed at all, it can only be executed. Perl has various built-in ambiguities that can only be resolved at runtime.
The following example may illustrate one common problem:
    print gettext "Hello World!";
Although this example looks like a bullet-proof case of a function invocation, it is not:
    open gettext, ">testfile" or die;
    print gettext "Hello world!"
In this context, the string gettext looks more like a file handle. But not necessarily:
    use Locale::Messages qw (:libintl_h);
    open gettext ">testfile" or die;
    print gettext "Hello world!";
Now, the file is probably syntactically incorrect, provided that the module Locale::Messages found first in the Perl include path exports a function gettext. But what if the module Locale::Messages really looks like this?
    use vars qw (*gettext);

    1;
In this case, the string gettext will be interpreted as a file handle again, and the above example will create a file ‘testfile’ and write the string “Hello world!” into it. Even advanced control flow analysis will not really help:
    if (0.5 < rand) {
        eval "use Sane";
    } else {
        eval "use InSane";
    }
    print gettext "Hello world!";
If the module Sane exports a function gettext that does what we expect, and the module InSane opens a file for writing and associates the handle gettext with this output stream, we are clueless again about what will happen at runtime. It is completely unpredictable. The truth is that Perl has so many ways to fill its symbol table at runtime that it is impossible to interpret a particular piece of code without executing it.
Of course, xgettext will not execute your Perl sources while scanning for translatable strings, but rather use heuristics in order to guess what you meant.
Another problem is the ambiguity of the slash and the question mark. Their interpretation depends on the context:
    # A pattern match.
    print "OK\n" if /foobar/;

    # A division.
    print 1 / 2;

    # Another pattern match.
    print "OK\n" if ?foobar?;

    # Conditional.
    print $x ? "foo" : "bar";
The slash may either act as the division operator or introduce a pattern match, whereas the question mark may act as the ternary conditional operator or as a pattern match, too. Other programming languages like awk present similar problems, but the consequences of a misinterpretation are particularly nasty with Perl sources. In awk for instance, a statement can never exceed one line and the parser can recover from a parsing error at the next newline and interpret the rest of the input stream correctly. Perl is different, as a pattern match is terminated by the next appearance of the delimiter (the slash or the question mark) in the input stream, regardless of the semantic context. If a slash is really a division sign but misinterpreted as a pattern match, the rest of the input file is most probably parsed incorrectly.
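To see why a single misread slash can hide strings from the extractor, consider the following toy model. It is not how xgettext is implemented; the class SlashDemo and the helper swallowed are hypothetical and only mimic the rule that a pattern ends at the next delimiter.

```java
class SlashDemo {
    // Returns the text that would be consumed as a pattern body if the slash
    // at index 'slash' were taken for the start of a pattern match:
    // everything up to the next slash, or to the end of input if none follows.
    static String swallowed(String source, int slash) {
        int end = source.indexOf('/', slash + 1);
        return end < 0 ? source.substring(slash + 1)
                       : source.substring(slash + 1, end);
    }

    public static void main(String[] args) {
        String perl = "$x = 1 / 2; print gettext \"Hi\"; $y = 3 / 4;";
        // The first slash is a division, but suppose it is misread.
        System.out.println(swallowed(perl, perl.indexOf('/')));
    }
}
```

The printed span contains the whole gettext call, which in this model would be treated as part of a regular expression and never reach the message catalog.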
If you find that xgettext fails to extract strings from portions of your sources, you should therefore look out for slashes and/or question marks preceding these sections. You may have come across a bug in xgettext's Perl parser (and of course you should report that bug). In the meantime you should consider reformulating your code in a manner less challenging to xgettext.
15.5.18.2 Which keywords will xgettext look for?
Unless you instruct xgettext otherwise by invoking it with one of the options --keyword or -k, it will recognize the following keywords in your Perl sources:
gettext
dgettext:2
    The second argument will be extracted (the first is the text domain).
dcgettext:2
    The second argument will be extracted (the first is the text domain).
ngettext:1,2
    The first (singular) and the second (plural) argument will be extracted.
dngettext:2,3
    The second (singular) and the third (plural) argument will be extracted (the first is the text domain).
dcngettext:2,3
    The second (singular) and the third (plural) argument will be extracted (the first is the text domain).
gettext_noop
%gettext
    The keys of lookups into the hash %gettext will be extracted.
$gettext
    The keys of lookups into the hash reference $gettext will be extracted.
15.5.18.3 How to Extract Hash Keys
Translating messages at runtime is normally performed by looking up the original string in the translation database and returning the translated version. The “natural” Perl implementation is a hash lookup, and, of course, xgettext supports such practice.
    print __"Hello world!";
    print $__{"Hello world!"};
    print $__->{"Hello world!"};
    print $$__{"Hello world!"};
The above four lines all do the same thing. The Perl module Locale::TextDomain exports by default a hash %__ that is tied to the function __(). It also exports a reference $__ to %__.
If an argument to the xgettext option --keyword (or -k) starts with a percent sign, the rest of the keyword is interpreted as the name of a hash. If it starts with a dollar sign, the rest of the keyword is interpreted as a reference to a hash.
Note that you can omit the quotation marks (single or double) around the hash key (almost) whenever Perl itself allows it:
    print $gettext{Error};
The exact rule is: You can omit the surrounding quotes when the hash key is a valid C (!) identifier, i.e. when it starts with an underscore or an ASCII letter and is followed by an arbitrary number of underscores, ASCII letters or digits. Other Unicode characters are not allowed, regardless of the use utf8 pragma.
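The identifier rule can be written down as a one-line regular expression. The sketch below is only an illustration of the rule as stated here, not code taken from xgettext; the class BarewordKeyDemo and the method canOmitQuotes are hypothetical names.

```java
import java.util.regex.Pattern;

class BarewordKeyDemo {
    // An underscore or ASCII letter, followed by any number of underscores,
    // ASCII letters or digits. Non-ASCII letters are deliberately excluded.
    private static final Pattern C_IDENTIFIER =
        Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Hypothetical check: may this hash key be written without quotes?
    static boolean canOmitQuotes(String key) {
        return C_IDENTIFIER.matcher(key).matches();
    }

    public static void main(String[] args) {
        String[] keys = { "Error", "_private1", "9lives", "Hello world!" };
        for (String key : keys) {
            System.out.println(key + " -> " + canOmitQuotes(key));
        }
    }
}
```

Keys that fail the check, such as whole sentences, simply need their quotation marks to be extracted.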
15.5.18.4 What are Strings And Quote-like Expressions?
Perl offers a plethora of different string constructs. Those that can be used either as arguments to functions or inside braces for hash lookups are generally supported by xgettext.
Double-quoted strings:
    print gettext "Hello World!";
Single-quoted strings:
    print gettext 'Hello World!';
The operator qq:
    print gettext qq |Hello World!|;
    print gettext qq <E-mail: <guido\@imperia.net>>;
The operator qq is fully supported. You can use arbitrary delimiters, including the four bracketing delimiters (round, angle, square, curly) that nest.
The operator q:
    print gettext q |Hello World!|;
    print gettext q <E-mail: <guido@imperia.net>>;
The operator q is fully supported. You can use arbitrary delimiters, including the four bracketing delimiters (round, angle, square, curly) that nest.
The operator qx:
    print gettext qx ;LANGUAGE=C /bin/date;
    print gettext qx [/usr/bin/ls | grep '^[A-Z]*'];
The operator qx is fully supported. You can use arbitrary delimiters, including the four bracketing delimiters (round, angle, square, curly) that nest.
The example is actually a useless use of gettext. It will invoke the gettext function on the output of the command specified with the qx operator. The feature was included in order to make the interface consistent (the parser will extract all strings and quote-like expressions).
Here documents:
    print gettext <<'EOF';
    program not found in $PATH
    EOF

    print ngettext <<EOF, <<"EOF";
    one file deleted
    EOF
    several files deleted
    EOF
Here-documents are recognized. If the delimiter is enclosed in single quotes, the string is not interpolated. If it is enclosed in double quotes or has no quotes at all, the string is interpolated.
Delimiters that start with a digit are not supported!
15.5.18.5 Invalid Uses Of String Interpolation
Perl is capable of interpolating variables into strings. This offers some nice features in localized programs but can also lead to problems.
A common error is a construct like the following:
    print gettext "This is the program $0!\n";
Perl will interpolate at runtime the value of the variable $0 into the argument of the gettext() function. Hence, this argument is not a string constant but a variable argument ($0 is a global variable that holds the name of the Perl script being executed). The interpolation is performed by Perl before the string argument is passed to gettext() and will therefore depend on the name of the script which can only be determined at runtime. Consequently, it is almost impossible that a translation can be looked up at runtime (except if, by accident, the interpolated string is found in the message catalog).
The xgettext program will therefore terminate parsing with a fatal error if it encounters a variable inside of an extracted string. In general, this will happen for all kinds of string interpolations that cannot be safely performed at compile time. If you absolutely know what you are doing, you can always circumvent this behavior:
    my $know_what_i_am_doing = "This is program $0!\n";
    print gettext $know_what_i_am_doing;
Since the parser only recognizes strings and quote-like expressions, but not variables or other terms, the above construct will be accepted. You will have to find another way, however, to let your original string make it into your message catalog.
If invoked with the option --extract-all (or -a), variable interpolation will be accepted. Rationale: You will generally use this option in order to prepare your sources for internationalization.
Please see the manual page ‘man perlop’ for details of strings and quote-like expressions that are subject to interpolation and those that are not. Safe interpolations (that will not lead to a fatal error) are:
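The character escape sequences \t (tab, HT, TAB), \n (newline, NL), \r (return, CR), \f (form feed, FF), \b (backspace, BS), \a (alarm, bell, BEL), and \e (escape, ESC).
Octal characters, like \033. Note that octal escapes in the range of 400-777 are translated into a UTF-8 representation, regardless of the presence of the use utf8 pragma.
Hex characters, like \x1b.
Wide hex characters, like \x{263a}. Note that this escape is translated into a UTF-8 representation, regardless of the presence of the use utf8 pragma.
Control characters, like \c[ (CTRL-[).
Characters with Unicode names, like \N{LATIN CAPITAL LETTER C WITH CEDILLA}. Note that this escape is translated into a UTF-8 representation, regardless of the presence of the use utf8 pragma.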
The following escapes are considered partially safe:
\l: lowercase next char
\u: uppercase next char
\L: lowercase till \E
\U: uppercase till \E
\E: end case modification
\Q: quote non-word characters till \E
These escapes are only considered safe if the string consists of ASCII characters only. Translation of characters outside the range defined by ASCII is locale-dependent and can actually only be performed at runtime; xgettext doesn't do these locale-dependent translations at extraction time.
Except for the modifier \Q, these translations, albeit valid, are generally useless and only obfuscate your sources. If a translation can be safely performed at compile time you can just as well write what you mean.
15.5.18.6 Valid Uses Of String Interpolation
Perl is often used to generate sources for other programming languages or arbitrary file formats. Web applications that output HTML code make a prominent example for such usage.
You will often come across situations where you want to intersperse code written in the target (programming) language with translatable messages, like in the following HTML example:
    print gettext <<EOF;
    <h1>My Homepage</h1>
    <script language="JavaScript"><!--
    for (i = 0; i < 100; ++i) {
        alert ("Thank you so much for visiting my homepage!");
    }
    //--></script>
    EOF
The parser will extract the entire here document, and it will appear entirely in the resulting PO file, including the JavaScript snippet embedded in the HTML code. If you overuse constructs like the above, you run the risk that the translators of your package will look for a less challenging project. You should consider an alternative expression here:
    print <<EOF;
    <h1>$gettext{"My Homepage"}</h1>
    <script language="JavaScript"><!--
    for (i = 0; i < 100; ++i) {
        alert ("$gettext{'Thank you so much for visiting my homepage!'}");
    }
    //--></script>
    EOF
Only the translatable portions of the code will be extracted here, and the resulting PO file will be considerably easier to read.
You can interpolate hash lookups in all strings or quote-like expressions that are subject to interpolation (see the manual page ‘man perlop’ for details). Double interpolation is invalid, however:
    # TRANSLATORS: Replace "the earth" with the name of your planet.
    print gettext qq{Welcome to $gettext->{"the earth"}};
The qq-quoted string is recognized as an argument to gettext in the first place, and checked for invalid variable interpolation. The dollar sign of hash-dereferencing will therefore terminate the parser with an “invalid interpolation” error.
It is valid to interpolate hash lookups in regular expressions:
    if ($var =~ /$gettext{"the earth"}/) {
        print gettext "Match!\n";
    }

    s/$gettext{"U. S. A."}/$gettext{"U. S. A."} $gettext{"(dial +0)"}/g;
15.5.18.7 When To Use Parentheses
In Perl, parentheses around function arguments are mostly optional. xgettext will always assume that all recognized keywords (except for hashes and hash references) are names of properly prototyped functions, and will (hopefully) only require parentheses where Perl itself requires them. All constructs in the following example are therefore ok to use:
    print gettext ("Hello World!\n");
    print gettext "Hello World!\n";
    print dgettext ($package => "Hello World!\n");
    print dgettext $package, "Hello World!\n";

    # The "fat comma" => turns the left-hand side argument into a
    # single-quoted string!
    print dgettext smellovision => "Hello World!\n";

    # The following assignment only works with prototyped functions.
    # Otherwise, the functions will act as "greedy" list operators and
    # eat up all following arguments.
    my $anonymous_hash = {
        planet => gettext "earth",
        cakes => ngettext "one cake", "several cakes", $n,
        still => $works,
    };
    # The same without fat comma:
    my $other_hash = {
        'planet', gettext "earth",
        'cakes', ngettext "one cake", "several cakes", $n,
        'still', $works,
    };

    # Parentheses are only significant for the first argument.
    print dngettext 'package', ("one cake", "several cakes", $n), $discarded;
15.5.18.8 How To Grok with Long Lines
The necessity of long messages can often lead to a cumbersome or unreadable coding style. Perl has several options that may prevent you from writing unreadable code, and xgettext does its best to do likewise. This is where the dot operator (the string concatenation operator) may come in handy:
    print gettext ("This is a very long"
                   . " message that is still"
                   . " readable, because"
                   . " it is split into"
                   . " multiple lines.\n");
Perl is smart enough to concatenate these constant string fragments into one long string at compile time, and so is xgettext. You will only find one long message in the resulting POT file.
Note that the future Perl 6 will probably use the underscore (‘_’) as the string concatenation operator, and the dot (‘.’) for dereferencing. This new syntax is not yet supported by xgettext.
If embedded newline characters are not an issue, or even desired, you may also insert newline characters inside quoted strings wherever you feel like it:
    print gettext ("<em>In HTML output
    embedded newlines are generally no
    problem, since adjacent whitespace
    is always rendered into a single
    space character.</em>");
You may also consider using here documents:
    print gettext <<EOF;
    <em>In HTML output
    embedded newlines are generally no
    problem, since adjacent whitespace
    is always rendered into a single
    space character.</em>
    EOF
Please do not forget that the line breaks are real, i.e. they translate into newline characters that will consequently show up in the resulting POT file.
The foregoing sections should have shown that xgettext is quite smart in extracting translatable strings from Perl sources. Yet some more or less exotic constructs that could be expected to work actually do not.
One of the more relevant limitations can be found in the implementation of variable interpolation inside quoted strings. Only simple hash lookups can be used there:
    print <<EOF;
    $gettext{"The dot operator"
              . " does not work"
              . "here!"}
    Likewise, you cannot @{[ gettext ("interpolate function calls") ]}
    inside quoted strings or quote-like expressions.
    EOF
This is valid Perl code and will actually trigger invocations of the gettext function at runtime. Yet, the Perl parser in xgettext will fail to recognize the strings. A less obvious example can be found in the interpolation of regular expressions:
    s/<!--START_OF_WEEK-->/gettext ("Sunday")/e;
The modifier e causes the replacement part of the substitution to be evaluated as Perl code. Consequently, at runtime the function gettext() is called, but again, the parser fails to extract the string “Sunday”. Use a temporary variable as a simple workaround if you really need this feature:
    my $sunday = gettext "Sunday";
    s/<!--START_OF_WEEK-->/$sunday/;
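The two spellings behave the same at runtime; only their visibility to the extractor differs. A minimal sketch, assuming a hypothetical one-entry catalog and a stub `gettext` in place of a real binding:

```perl
use strict;
use warnings;

# Hypothetical one-entry catalog and stub gettext(), for illustration only.
my %catalog = ("Sunday" => "dimanche");
sub gettext {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

my $template = "Week starts on <!--START_OF_WEEK-->.";

# Works at runtime, but the msgid is hidden inside the s///e replacement.
(my $with_e = $template) =~ s/<!--START_OF_WEEK-->/gettext ("Sunday")/e;

# Same result; here the gettext call is an ordinary statement.
my $sunday = gettext ("Sunday");
(my $with_var = $template) =~ s/<!--START_OF_WEEK-->/$sunday/;

print "$with_var\n";
print $with_e eq $with_var ? "same result\n" : "different result\n";
```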
Hash slices would also be handy but are not recognized:
    my @weekdays = @gettext{'Sunday', 'Monday', 'Tuesday', 'Wednesday',
                            'Thursday', 'Friday', 'Saturday'};
    # Or even:
    @weekdays = @gettext{qw (Sunday Monday Tuesday Wednesday Thursday
                             Friday Saturday) };
This is perfectly valid usage of the tied hash %gettext, but the strings are not recognized and therefore will not be extracted.
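One way around the slice limitation is to write one simple hash lookup per string, the form the text above describes as recognized. The sketch below assumes a hypothetical tied-hash class, `DemoGettextHash`, standing in for the real tied %gettext; its fake "translation" just wraps the msgid in brackets.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tied %gettext hash: every lookup is
# routed through FETCH, where a real binding would call gettext().
package DemoGettextHash;
sub TIEHASH { my ($class) = @_; return bless {}, $class }
sub FETCH   { my ($self, $msgid) = @_; return "[$msgid]" }

package main;
tie my %gettext, 'DemoGettextHash';

# Hash slice: valid Perl, but not extracted according to the text above.
my @slice = @gettext{qw (Sunday Monday Tuesday)};

# One simple lookup per string gives the same values at runtime.
my @single = ($gettext{'Sunday'}, $gettext{'Monday'}, $gettext{'Tuesday'});

print "@single\n";
print "@slice" eq "@single" ? "same values\n" : "different values\n";
```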
Another caveat of the current version is its rudimentary support for non-ASCII characters in identifiers. You may encounter serious problems if you use identifiers with characters outside the range of 'A'-'Z', 'a'-'z', '0'-'9' and the underscore '_'.
Maybe some of these missing features will be implemented in future versions, but since you can always make do without them at minimal effort, these todos have very low priority.
Brace format strings that already contain braces as part of the normal text pose a nasty problem, for example the usage strings typically encountered in programs:
    die "usage: $0 {OPTIONS} FILENAME...\n";
If you want to internationalize this code with Perl brace format strings, you will run into a problem:
    die __x ("usage: {program} {OPTIONS} FILENAME...\n", program => $0);
Whereas ‘{program}’ is a placeholder, ‘{OPTIONS}’ is not and should probably be translated. Yet, there is no way to teach the Perl parser in xgettext to recognize the first one, and leave the other one alone.
There are two possible workarounds for this problem. If you are sure that your program will run under Perl 5.8.0 or newer (these Perl versions handle positional parameters in printf()), or if you are sure that the translator will not have to reorder the arguments in her translation (for example if you have only one brace placeholder in your string, or if it describes a syntax, as in this one), you can mark the string as no-perl-brace-format and use printf():
    # xgettext: no-perl-brace-format
    die sprintf (gettext ("usage: %s {OPTIONS} FILENAME...\n"), $0);
If you want to use the more portable Perl brace format, you will have to put placeholders in place of the literal braces:
    die __x ("usage: {program} {[}OPTIONS{]} FILENAME...\n",
             program => $0, '[' => '{', ']' => '}');
Perl brace format strings know no escaping mechanism. Whatever such a mechanism looked like, it would either give the programmer a hard time, make translating Perl brace format strings heavy going, or result in a performance penalty at runtime, when the format directives get executed. Most of the time you will happily get along with printf() for this special case.
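To see why the ‘{[}’ and ‘{]}’ placeholders yield literal braces, here is a minimal sketch of brace-format expansion. The helper `expand_braces` is a hypothetical illustration and not the actual __x() implementation; it assumes each placeholder is replaced in a single pass and that unknown placeholders are left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper imitating brace-format expansion; it is not the
# __x() implementation of any library.  Unknown placeholders stay as-is.
sub expand_braces {
    my ($format, %args) = @_;
    $format =~ s/\{([^{}]+)\}/exists $args{$1} ? $args{$1} : "{" . $1 . "}"/ge;
    return $format;
}

my $usage = expand_braces ("usage: {program} {[}OPTIONS{]} FILENAME...\n",
                           program => "demo",
                           '[' => '{', ']' => '}');
print $usage;
```

Because the substitution does not rescan text it has already replaced, the braces produced for ‘{[}’ and ‘{]}’ are not mistaken for a new placeholder around OPTIONS.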
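RPMs: mod_php4, mod_php4-core, phpdoc
File extension: php, php3, php4
String syntax: "abc", 'abc'
gettext shorthand: _("abc")
gettext/ngettext functions: gettext, dgettext, dcgettext; starting with PHP 4.2.0 also ngettext, dngettext, dcngettext
textdomain: textdomain function
bindtextdomain: bindtextdomain function
setlocale: Programmer must call setlocale (LC_ALL, "")
Prerequisite: none
Use or emulate GNU gettext: use
Extractor: xgettext
Formatting with positions: printf "%2\$d %1\$d"
Portability: On platforms without gettext, the functions are not available.
po-mode marking: none
An example is available in the ‘examples’ directory: hello-php.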
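RPMs: roxen
File extension: pike
String syntax: "abc"
gettext shorthand: none
gettext/ngettext functions: gettext, dgettext, dcgettext
textdomain: textdomain function
bindtextdomain: bindtextdomain function
setlocale: setlocale function
Prerequisite: import Locale.Gettext;
Use or emulate GNU gettext: use
Extractor: none
Formatting with positions: none
Portability: On platforms without gettext, the functions are not available.
po-mode marking: none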
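RPMs: gcc
File extension: c, h
String syntax: "abc"
gettext shorthand: _("abc")
gettext/ngettext functions: gettext, dgettext, dcgettext, ngettext, dngettext, dcngettext
textdomain: textdomain function
bindtextdomain: bindtextdomain function
setlocale: Programmer must call setlocale (LC_ALL, "")
Prerequisite: #include "intl.h"
Use or emulate GNU gettext: use
Extractor: xgettext -k_
Formatting with positions: none
Portability: Uses autoconf macros
po-mode marking: yes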
Here is a list of other data formats which can be internationalized using GNU gettext.
15.6.1 POT - Portable Object Template
15.6.2 Resource String Table
15.6.3 Glade - GNOME user interface description
RPMs: gettext
File extension: pot, po
Extractor: xgettext
RPMs: fpk
File extension: rst
Extractor: xgettext, rstconv
RPMs: glade, libglade, glade2, libglade2, intltool
File extension: glade, glade2
Extractor: xgettext, libglade-xgettext, xml-i18n-extract, intltool-extract
We would like to conclude this GNU gettext manual by presenting a history of the Translation Project so far. We finally give a few pointers for those who want to do further research or readings about Native Language Support matters.
16.1 History of GNU gettext
16.2 Related Readings
16.1 History of GNU gettext
Internationalization concerns and algorithms have been informally and casually discussed for years in GNU, sometimes around GNU libc, maybe around the incoming Hurd, or otherwise (nobody clearly remembers). And even then, when the work started for real, this was somewhat independently of these previous discussions.
This all began in July 1994, when Patrick D'Cruze had the idea and initiative of internationalizing version 3.9.2 of GNU fileutils. He then asked Jim Meyering, the maintainer, how to get those changes folded into an official release. That first draft was full of #ifdefs and somewhat disconcerting, and Jim wanted to find nicer ways. Patrick and Jim shared some attempts and experiments in this area. Then, feeling that this might eventually have a deeper impact on GNU, Jim wanted to know what the standards were, and contacted Richard Stallman, who very quickly and verbally described an overall design for what was meant to become glocale, at that time.
Jim implemented glocale and got a lot of exhausting feedback from Patrick and Richard, of course, but also from Mitchum DSouza (who wrote a catgets-like package), Roland McGrath, maybe David MacKenzie, François Pinard, and Paul Eggert, all pushing and pulling in various directions, not always compatible, to the extent that after a couple of test releases, glocale was torn apart. In particular, Paul Eggert, always keeping an eye on developments in Solaris, advocated the use of the gettext API over glocale's catgets-based API.
While Jim took some distance and time and became a dad for the second time, Roland wanted to get GNU libc internationalized, and got Ulrich Drepper involved in that project. Instead of starting from glocale, Ulrich rewrote something from scratch, but more in line with the set of guidelines that emerged out of the glocale effort. Then, Ulrich got people from the previous forum to involve themselves in this new project, and the switch from glocale to what was first named msgutils, renamed nlsutils, and later gettext, became officially accepted by Richard in May 1995 or so.
Let's summarize by saying that Ulrich Drepper wrote GNU gettext in April 1995. The first official release of the package, including PO mode, occurred in July 1995, and was numbered 0.7. Other people contributed to the effort by providing a discussion forum around Ulrich, writing little pieces of code, or testing. They are credited in the THANKS file which comes with the GNU gettext distribution.
While this was being done, François adapted half a dozen GNU packages to glocale first, then later to gettext, putting them in pretest, thereby providing along the way an effective user environment for fine-tuning the evolving tools. He also took the responsibility of organizing and coordinating the Translation Project. After nearly a year of informal exchanges between people from many countries, translator teams started to exist in May 1995, through the creation and support by Patrick D'Cruze of twenty unmoderated mailing lists for that many native languages, and two moderated lists: one for reaching all teams at once, the other for reaching all willing maintainers of internationalized free software packages.
François also wrote PO mode in June 1995 with the collaboration of Greg McGary, as a kind of contribution to Ulrich's package. He also gave a hand with the GNU gettext Texinfo manual.
In 1997, Ulrich Drepper released GNU libc 2.0, which included the gettext, textdomain and bindtextdomain functions.
In 2000, Ulrich Drepper added plural form handling (the ngettext function) to GNU libc. Later, in 2001, he released GNU libc 2.2.x, which is the first free C library with full internationalization support.
Ulrich being quite busy in his role as General Maintainer of GNU libc, he handed over the GNU gettext maintenance to Bruno Haible in 2000. Bruno added the plural form handling to the tools as well, added support for UTF-8 and CJK locales, and wrote a few new tools for manipulating PO files.
NOTE: This documentation section is outdated and needs to be revised.
Eugene H. Dorr (‘dorre@well.com’) maintains an interesting bibliography on internationalization matters, called Internationalization Reference List, which is available as:
    ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/i18n-books.txt
Michael Gschwind (‘mike@vlsivie.tuwien.ac.at’) maintains a Frequently Asked Questions (FAQ) list, entitled Programming for Internationalisation. This FAQ discusses writing programs which can handle different language conventions, character sets, etc.; and is applicable to all character set encodings, with particular emphasis on ISO 8859-1. It is regularly published in Usenet groups ‘comp.unix.questions’, ‘comp.std.internat’, ‘comp.software.international’, ‘comp.lang.c’, ‘comp.windows.x’, ‘comp.std.c’, ‘comp.answers’ and ‘news.answers’. The home location of this document is:
    ftp://ftp.vlsivie.tuwien.ac.at/pub/8bit/ISO-programming
Patrick D'Cruze (‘pdcruze@li.org’) wrote a tutorial about NLS matters, and Jochen Hein (‘Hein@student.tu-clausthal.de’) took over the responsibility of maintaining it. It may be found as:
    ftp://sunsite.unc.edu/pub/Linux/utils/nls/catalogs/Incoming/...
         ...locale-tutorial-0.8.txt.gz
This site is mirrored at:
    ftp://ftp.ibp.fr/pub/linux/sunsite/
A French version of the same tutorial should be available at:
    ftp://ftp.ibp.fr/pub/linux/french/docs/
together with French translations of many Linux-related documents.
The ISO 639 standard defines two-letter codes for many languages, and three-letter codes for more rarely used languages. All abbreviations for languages used in the Translation Project should come from this standard.
A.1 Usual Language Codes | Two-letter ISO 639 language codes
A.2 Rare Language Codes | Three-letter ISO 639 language codes
For the commonly used languages, the ISO 639-1 standard defines two-letter codes.
Afar.
Abkhazian.
Adangme.
Avestan.
Afrikaans.
Akan.
Amharic.
Aragonese.
Arabic.
Assamese.
Avaric.
Aymara.
Azerbaijani.
Bashkir.
Byelorussian; Belarusian.
Bulgarian.
Bihari.
Bislama.
Bambara.
Bengali; Bangla.
Tibetan.
Breton.
Bosnian.
Catalan.
Chechen.
Chamorro.
Corsican.
Cree.
Czech.
Church Slavic.
Chuvash.
Welsh.
Danish.
German.
Divehi; Maldivian.
Dzongkha; Bhutani.
Éwé.
Greek.
English.
Esperanto.
Spanish.
Estonian.
Basque.
Persian.
Fulah.
Finnish.
Fijian; Fiji.
Faroese.
French.
Western Frisian.
Irish.
Scots; Gaelic.
Galician.
Guarani.
Gujarati.
Manx.
Hausa.
Hebrew (formerly iw).
Hindi.
Hiri Motu.
Croatian.
Haitian; Haitian Creole.
Hungarian.
Armenian.
Herero.
Interlingua.
Indonesian (formerly in).
Interlingue.
Igbo.
Sichuan Yi.
Inupiak; Inupiaq.
Ido.
Icelandic.
Italian.
Inuktitut.
Japanese.
Javanese.
Georgian.
Kongo.
Kikuyu; Gikuyu.
Kuanyama; Kwanyama.
Kazakh.
Kalaallisut; Greenlandic.
Khmer; Cambodian.
Kannada.
Korean.
Kanuri.
Kashmiri.
Kurdish.
Komi.
Cornish.
Kirghiz.
Latin.
Letzeburgesch; Luxembourgish.
Ganda.
Limburgish; Limburger; Limburgan.
Lingala.
Lao; Laotian.
Lithuanian.
Luba-Katanga.
Latvian; Lettish.
Malagasy.
Marshallese.
Maori.
Macedonian.
Malayalam.
Mongolian.
Moldavian.
Marathi.
Malay.
Maltese.
Burmese.
Nauru.
Norwegian Bokmål.
Ndebele, North.
Nepali.
Ndonga.
Dutch.
Norwegian Nynorsk.
Norwegian.
Ndebele, South.
Navajo; Navaho.
Chichewa; Nyanja.
Occitan; Provençal.
Ojibwa.
(Afan) Oromo.
Oriya.
Ossetian; Ossetic.
Panjabi; Punjabi.
Pali.
Polish.
Pashto, Pushto.
Portuguese.
Quechua.
Rhaeto-Romance.
Rundi; Kirundi.
Romanian.
Russian.
Kinyarwanda.
Sanskrit.
Sardinian.
Sindhi.
Northern Sami.
Sango; Sangro.
Sinhala; Sinhalese.
Slovak.
Slovenian.
Samoan.
Shona.
Somali.
Albanian.
Serbian.
Swati; Siswati.
Sesotho; Sotho, Southern.
Sundanese.
Swedish.
Swahili.
Tamil.
Telugu.
Tajik.
Thai.
Tigrinya.
Turkmen.
Tagalog.
Tswana; Setswana.
Tonga.
Turkish.
Tsonga.
Tatar.
Twi.
Tahitian.
Uighur.
Ukrainian.
Urdu.
Uzbek.
Venda.
Vietnamese.
Volapük; Volapuk.
Walloon.
Wolof.
Xhosa.
Yiddish (formerly ji).
Yoruba.
Zhuang.
Chinese.
Zulu.
For rarely used languages, the ISO 639-2 standard defines three-letter codes. Here is the current list, reduced to only living languages with at least one million speakers.
Achinese.
Awadhi.
Banda.
Baluchi.
Balinese.
Bemba.
Bhojpuri.
Bikol.
Bini.
Batak (Indonesia).
Buginese.
Cebuano.
Dinka.
Dogri.
Filipino; Pilipino.
Fon.
Gondi.
Alemani; Swiss German.
Hiligaynon.
Hmong.
Iloko.
Kabyle.
Kamba.
Kabardian.
Kimbundu.
Konkani.
Kurukh.
Luba-Lulua.
Luo (Kenya and Tanzania).
Madurese.
Magahi.
Maithili.
Makasar.
Mandingo.
Mende.
Minangkabau.
Manipuri.
Mossi.
Marwari.
Neapolitan.
Pedi; Sepedi; Northern Sotho.
Nyamwezi.
Nyankole.
Pangasinan.
Pampanga.
Rajasthani.
Sasak.
Santali.
Sicilian.
Shan.
Sidamo.
Serer.
Sukuma.
Susu.
Timne.
Tiv.
Tumbuka.
Umbundu.
Walamo.
Waray.
Yao.
The ISO 3166 standard defines two-character codes for many countries and territories. All abbreviations for countries used in the Translation Project should come from this standard.
Andorra.
United Arab Emirates.
Afghanistan.
Antigua and Barbuda.
Anguilla.
Albania.
Armenia.
Netherlands Antilles.
Angola.
Antarctica.
Argentina.
Samoa (American).
Austria.
Australia.
Aruba.
Aaland Islands.
Azerbaijan.
Bosnia and Herzegovina.
Barbados.
Bangladesh.
Belgium.
Burkina Faso.
Bulgaria.
Bahrain.
Burundi.
Benin.
Bermuda.
Brunei.
Bolivia.
Brazil.
Bahamas.
Bhutan.
Bouvet Island.
Botswana.
Belarus.
Belize.
Canada.
Cocos (Keeling) Islands.
Congo (Dem. Rep.).
Central African Republic.
Congo (Rep.).
Switzerland.
Côte d'Ivoire.
Cook Islands.
Chile.
Cameroon.
China.
Colombia.
Costa Rica.
Cuba.
Cape Verde.
Christmas Island.
Cyprus.
Czech Republic.
Germany.
Djibouti.
Denmark.
Dominica.
Dominican Republic.
Algeria.
Ecuador.
Estonia.
Egypt.
Western Sahara.
Eritrea.
Spain.
Ethiopia.
Finland.
Fiji.
Falkland Islands.
Micronesia.
Faeroe Islands.
France.
Gabon.
Britain (United Kingdom).
Grenada.
Georgia.
French Guiana.
Guernsey.
Ghana.
Gibraltar.
Greenland.
Gambia.
Guinea.
Guadeloupe.
Equatorial Guinea.
Greece.
South Georgia and the South Sandwich Islands.
Guatemala.
Guam.
Guinea-Bissau.
Guyana.
Hong Kong.
Heard Island and McDonald Islands.
Honduras.
Croatia.
Haiti.
Hungary.
Indonesia.
Ireland.
Israel.
Isle of Man.
India.
British Indian Ocean Territory.
Iraq.
Iran.
Iceland.
Italy.
Jersey.
Jamaica.
Jordan.
Japan.
Kenya.
Kyrgyzstan.
Cambodia.
Kiribati.
Comoros.
St Kitts and Nevis.
Korea (North).
Korea (South).
Kuwait.
Cayman Islands.
Kazakhstan.
Laos.
Lebanon.
St Lucia.
Liechtenstein.
Sri Lanka.
Liberia.
Lesotho.
Lithuania.
Luxembourg.
Latvia.
Libya.
Morocco.
Monaco.
Moldova.
Montenegro.
Madagascar.
Marshall Islands.
Macedonia.
Mali.
Myanmar (Burma).
Mongolia.
Macao.
Northern Mariana Islands.
Martinique.
Mauritania.
Montserrat.
Malta.
Mauritius.
Maldives.
Malawi.
Mexico.
Malaysia.
Mozambique.
Namibia.
New Caledonia.
Niger.
Norfolk Island.
Nigeria.
Nicaragua.
Netherlands.
Norway.
Nepal.
Nauru.
Niue.
New Zealand.
Oman.
Panama.
Peru.
French Polynesia.
Papua New Guinea.
Philippines.
Pakistan.
Poland.
St Pierre and Miquelon.
Pitcairn.
Puerto Rico.
Palestine.
Portugal.
Palau.
Paraguay.
Qatar.
Reunion.
Romania.
Serbia.
Russia.
Rwanda.
Saudi Arabia.
Solomon Islands.
Seychelles.
Sudan.
Sweden.
Singapore.
St Helena.
Slovenia.
Svalbard and Jan Mayen.
Slovakia.
Sierra Leone.
San Marino.
Senegal.
Somalia.
Suriname.
Sao Tome and Principe.
El Salvador.
Syria.
Swaziland.
Turks and Caicos Islands.
Chad.
French Southern and Antarctic Lands.
Togo.
Thailand.
Tajikistan.
Tokelau.
Timor-Leste.
Turkmenistan.
Tunisia.
Tonga.
Turkey.
Trinidad and Tobago.
Tuvalu.
Taiwan.
Tanzania.
Ukraine.
Uganda.
US minor outlying islands.
United States.
Uruguay.
Uzbekistan.
Vatican City.
St Vincent and the Grenadines.
Venezuela.
Virgin Islands (UK).
Virgin Islands (US).
Vietnam.
Vanuatu.
Wallis and Futuna.
Samoa (Western).
Yemen.
Mayotte.
South Africa.
Zambia.
Zimbabwe.
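Language and country codes are commonly combined into locale names of the form ll_CC. The following is a minimal sketch of splitting such a name into its two parts; the helper `split_locale` and the sample names are illustrative assumptions. It only checks the shape of the name, not whether the codes are actually registered in ISO 639 or ISO 3166.

```perl
use strict;
use warnings;

# Split a locale name of the form ll, lll, ll_CC or lll_CC into its
# language part (ISO 639) and optional country part (ISO 3166).
# Returns an empty list if the name does not have that shape.
sub split_locale {
    my ($name) = @_;
    return unless $name =~ /\A([a-z]{2,3})(?:_([A-Z]{2}))?\z/;
    return ($1, $2);
}

for my $name (qw(fr pt_BR ace de_AT)) {
    my ($lang, $country) = split_locale ($name);
    printf "%-6s language=%s country=%s\n",
        $name, $lang, defined $country ? $country : '(none)';
}
```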
The files of this package are covered by the licenses indicated in each particular file or directory. Here is a summary:
The libintl and libasprintf libraries are covered by the GNU Library General Public License (LGPL). A copy of the license is included in section C.2, GNU LESSER GENERAL PUBLIC LICENSE.
The executable programs of this package and the libgettextpo library are covered by the GNU General Public License (GPL). A copy of the license is included in section C.1, GNU GENERAL PUBLIC LICENSE.
C.1 GNU GENERAL PUBLIC LICENSE
C.2 GNU LESSER GENERAL PUBLIC LICENSE
C.3 GNU Free Documentation License
Version 2, June 1991
    Copyright © 1989, 1991 Free Software Foundation, Inc.
    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA

    Everyone is permitted to copy and distribute verbatim copies
    of this license document, but changing it is not allowed.
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software—to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.
Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and modification follow.
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.
These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.
In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.
This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.
Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and “any later version”, you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.
If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the “copyright” line and a pointer to where the full notice is found.
    one line to give the program's name and a brief idea of what it does.
    Copyright (C) yyyy name of author

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software Foundation,
    Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this when it starts in an interactive mode:
    Gnomovision version 69, Copyright (C) 19yy name of author
    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
    type `show w'. This is free software, and you are welcome
    to redistribute it under certain conditions; type `show c'
    for details.
The hypothetical commands ‘show w’ and ‘show c’ should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than ‘show w’ and ‘show c’; they could even be mouse-clicks or menu items—whatever suits your program.
You should also get your employer (if you work as a programmer) or your school, if any, to sign a “copyright disclaimer” for the program, if necessary. Here is a sample; alter the names:
    Yoyodyne, Inc., hereby disclaims all copyright
    interest in the program `Gnomovision'
    (which makes passes at compilers) written
    by James Hacker.

    signature of Ty Coon, 1 April 1989
    Ty Coon, President of Vice
This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Library General Public License instead of this License.
Version 2.1, February 1999
    Copyright © 1991, 1999 Free Software Foundation, Inc.
    51 Franklin St -- Fifth Floor, Boston, MA 02110-1301, USA

    Everyone is permitted to copy and distribute verbatim copies
    of this license document, but changing it is not allowed.

    [This is the first released version of the Lesser GPL. It also counts
    as the successor of the GNU Library Public License, version 2, hence
    the version number 2.1.]
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software—to make sure the software is free for all its users.
This license, the Lesser General Public License, applies to some specially designated software—typically libraries—of the Free Software Foundation and other authors who decide to use it. You can use it too, but we suggest you first think carefully about whether this license or the ordinary General Public License is the better strategy to use in any particular case, based on the explanations below.
When we speak of free software, we are referring to freedom of use, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish); that you receive source code or can get it if you want it; that you can change the software and use pieces of it in new free programs; and that you are informed that you can do these things.
To protect your rights, we need to make restrictions that forbid distributors to deny you these rights or to ask you to surrender these rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library or if you modify it.
For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link other code with the library, you must provide complete object files to the recipients, so that they can relink them with the library after making changes to the library and recompiling it. And you must show them these terms so they know their rights.
We protect your rights with a two-step method: (1) we copyright the library, and (2) we offer you this license, which gives you legal permission to copy, distribute and/or modify the library.
To protect each distributor, we want to make it very clear that there is no warranty for the free library. Also, if the library is modified by someone else and passed on, the recipients should know that what they have is not the original version, so that the original author's reputation will not be affected by problems that might be introduced by others.
Finally, software patents pose a constant threat to the existence of any free program. We wish to make sure that a company cannot effectively restrict the users of a free program by obtaining a restrictive license from a patent holder. Therefore, we insist that any patent license obtained for a version of the library must be consistent with the full freedom of use specified in this license.
Most GNU software, including some libraries, is covered by the ordinary GNU General Public License. This license, the GNU Lesser General Public License, applies to certain designated libraries, and is quite different from the ordinary General Public License. We use this license for certain libraries in order to permit linking those libraries into non-free programs.
When a program is linked with a library, whether statically or using a shared library, the combination of the two is legally speaking a combined work, a derivative of the original library. The ordinary General Public License therefore permits such linking only if the entire combination fits its criteria of freedom. The Lesser General Public License permits more lax criteria for linking other code with the library.
We call this license the Lesser General Public License because it does Less to protect the user's freedom than the ordinary General Public License. It also provides other free software developers Less of an advantage over competing non-free programs. These disadvantages are the reason we use the ordinary General Public License for many libraries. However, the Lesser license provides advantages in certain special circumstances.
For example, on rare occasions, there may be a special need to encourage the widest possible use of a certain library, so that it becomes a de-facto standard. To achieve this, non-free programs must be allowed to use the library. A more frequent case is that a free library does the same job as widely used non-free libraries. In this case, there is little to gain by limiting the free library to free software only, so we use the Lesser General Public License.
In other cases, permission to use a particular library in non-free programs enables a greater number of people to use a large body of free software. For example, permission to use the GNU C Library in non-free programs enables many more people to use the whole GNU operating system, as well as its variant, the GNU/Linux operating system.
Although the Lesser General Public License is Less protective of the users' freedom, it does ensure that the user of a program that is linked with the Library has the freedom and the wherewithal to run that program using a modified version of the Library.
The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a “work based on the library” and a “work that uses the library”. The former contains code derived from the library, whereas the latter must be combined with the library in order to run.
A “library” means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables.
The “Library”, below, refers to any such software library or work which has been distributed under these terms. A “work based on the Library” means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term “modification”.)
“Source code” for a work means the preferred form of the work for making modifications to it. For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library.
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). Whether that is true depends on what the Library does and what the program that uses the Library does.
You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.
(For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.)
These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library.
In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy.
This option is useful when you wish to copy part of the code of the Library into a program that is not a library.
If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code.
However, linking a “work that uses the Library” with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a “work that uses the library”. The executable is therefore covered by this License. Section 6 states terms for distribution of such executables.
When a “work that uses the Library” uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especially significant if the work can be linked without the Library, or if the work is itself a library. The threshold for this to be true is not precisely defined by law.
If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.)
Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself.
You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. Also, you must do one of these things:
For an executable, the required form of the “work that uses the Library” must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the materials to be distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.
This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.
Each version is given a distinguishing version number. If the Library specifies a version number of this License which applies to it and “any later version”, you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation.
If you develop a new library, and you want it to be of the greatest possible use to the public, we recommend making it free software that everyone can redistribute and change. You can do so by permitting redistribution under these terms (or, alternatively, under the terms of the ordinary General Public License).
To apply these terms, attach the following notices to the library. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the “copyright” line and a pointer to where the full notice is found.
one line to give the library's name and an idea of what it does.
Copyright (C) year name of author
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
Also add information on how to contact you by electronic and paper mail.
You should also get your employer (if you work as a programmer) or your school, if any, to sign a “copyright disclaimer” for the library, if necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the library `Frob' (a library for tweaking knobs) written by James Random Hacker.
signature of Ty Coon, 1 April 1990
Ty Coon, President of Vice
That's all there is to it!
Version 1.2, November 2002
Copyright © 2000,2001,2002 Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The purpose of this License is to make a manual, textbook, or other functional and useful document free in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.
This License is a kind of “copyleft”, which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.
This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The “Document”, below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as “you”. You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.
A “Modified Version” of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.
A “Secondary Section” is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.
The “Invariant Sections” are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.
A “Transparent” copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not “Transparent” is called “Opaque”.
Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.
The “Title Page” means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, “Title Page” means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text.
A section “Entitled XYZ” means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as “Acknowledgements”, “Dedications”, “Endorsements”, or “History”.) To “Preserve the Title” of such a section when you modify the Document means that it remains a section “Entitled XYZ” according to this definition.
The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.
You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly display copies.
If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.
You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles.
You may add a section Entitled “Endorsements”, provided it contains nothing but endorsements of your Modified Version by various parties—for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.
You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections Entitled “History” in the various original documents, forming one section Entitled “History”; likewise combine any sections Entitled “Acknowledgements”, and any sections Entitled “Dedications”. You must delete all sections Entitled “Endorsements.”
You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.
A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an “aggregate” if the copyright resulting from the compilation is not used to limit the legal rights of the compilation's users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document's Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.
Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.
If a section in the Document is Entitled “Acknowledgements”, “Dedications”, or “History”, the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.
You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License “or any later version” applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.
To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:
Copyright (C) year your name.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled ``GNU Free Documentation License''.
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the “with...Texts.” line with this:
with the Invariant Sections being list their titles, with the Front-Cover Texts being list, and with the Back-Cover Texts being list.
If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.
If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.
In this manual, all mentions of Emacs refer to either GNU Emacs or XEmacs, which people sometimes call FSF Emacs and Lucid Emacs, respectively.
This limitation is not imposed by GNU gettext, but is for compatibility with the msgfmt implementation on Solaris.
Some systems, e.g. mingw, don't have LC_MESSAGES. Here we use a more or less arbitrary value for it, namely 1729, the smallest positive integer which can be represented in two different ways as the sum of two cubes.
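The arithmetic claim about 1729 can be checked by brute force. The following is a minimal sketch in Python; the function names `cube_sum_ways` and `smallest_two_way_cube_sum` are illustrative only and are not part of gettext.

```python
from itertools import combinations_with_replacement

def cube_sum_ways(n):
    """Return all pairs (a, b) with 1 <= a <= b and a**3 + b**3 == n."""
    # Upper bound on a and b: slightly above the cube root of n.
    limit = round(n ** (1 / 3)) + 1
    return [(a, b)
            for a, b in combinations_with_replacement(range(1, limit + 1), 2)
            if a**3 + b**3 == n]

def smallest_two_way_cube_sum():
    """Find the smallest positive integer with at least two such pairs."""
    n = 1
    while len(cube_sum_ways(n)) < 2:
        n += 1
    return n

print(cube_sum_ways(1729))          # the two decompositions of 1729
print(smallest_two_way_cube_sum())  # the smallest such integer
```

The two decompositions are 1³ + 12³ and 9³ + 10³. The value has no meaning to the system beyond being unlikely to collide with a real locale category constant.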
When the system does not support setlocale, its behavior in setting the locale values is simulated by looking at the environment variables.
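One way to picture such a simulation is a lookup over environment variables in priority order. The sketch below is in Python rather than C, and it assumes the usual POSIX precedence (LC_ALL, then the category-specific variable, then LANG, with "C" as the default); the footnote does not spell out the exact rules gettext applies, and the function name `locale_from_environment` is hypothetical.

```python
import os

def locale_from_environment(category, environ=None):
    """Pick a locale name for `category` (e.g. "LC_MESSAGES").

    Assumed precedence: LC_ALL, then the category variable, then LANG.
    Unset or empty variables are skipped; "C" is the final default.
    """
    env = os.environ if environ is None else environ
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"

# Example with an explicit mapping instead of the real environment.
print(locale_from_environment("LC_MESSAGES", {"LANG": "fr_FR.UTF-8"}))
```

Passing the environment as a parameter keeps the function easy to exercise without touching the real process environment.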
Additions are welcome. Send appropriate information to bug-gnu-gettext@gnu.org and bug-glibc-manual@gnu.org.
This document was generated by Mathieu on May 11, 2009 using texi2html 1.78.