Structural changes - gettext?
This discussion is connected to the gimp-docs-list.gnome.org mailing list which is provided by the GIMP developers and not related to gimpusers.com.
This is a read-only list on gimpusers.com so this discussion thread is read-only, too.
Structural changes -> gettext? | Roman Joost | 23 May 15:00 |
Structural changes -> gettext? | Choi, JiHui | 24 May 06:53 |
Structural changes -> gettext? | Roman Joost | 25 May 18:29 |
Structural changes -> gettext? | Marco Ciampa | 25 May 15:47 |
Structural changes -> gettext? | Alessandro Falappa | 26 May 11:05 |
Structural changes -> gettext? | julien | 27 May 07:35 |
Structural changes -> gettext? | Kolbjørn Stuestøl | 03 Jun 12:41 |
Structural changes -> gettext? | Alessandro Falappa | 03 Jun 14:22 |
Structural changes -> gettext? | Alessandro Falappa | 03 Jun 14:30 |
Structural changes -> gettext? | Kolbjørn Stuestøl | 03 Jun 23:00 |
Structural changes -> gettext? | Joao S. O. Bueno | 03 Jun 14:58 |
Structural changes - gettext? | Joao S. O. Bueno | 03 Jun 15:18 |
Structural changes - gettext? | Marco Ciampa | 09 Jun 11:30 |
Structural changes - gettext? | Joao S. O. Bueno | 10 Jun 02:41 |
Structural changes - gettext? | julien | 10 Jun 18:37 |
Structural changes - gettext? | Joao S. O. Bueno | 12 Jun 03:47 |
Structural changes - gettext? | Kolbjørn Stuestøl | 10 Jun 16:54 |
Structural changes -> gettext?
Hi folks,
after LGM with a lot of feedback, I would like to propose a structural change to how we currently write documentation.
I want to propose using gettext for translations and only one reference language for our manual. My last inquiry on the mailing list showed that most of you were really annoyed by the mixed-up languages in the XML files. Back in 2003, it was a new attempt at writing documentation, but it turned out to be hard to maintain, with a steep learning curve for new contributors.
Why gettext? It is currently a widespread technique for providing translations of documentation. The free software community provides a wide set of usable tools for translators.
Of course, I'm fully aware of the possible drawbacks of this approach. We need a strategy to migrate the current content and to set a reference language. I'd propose English, because every author who wants to introduce new content needs to write it in English first and translate it afterwards.
After this migration, we concentrate on only one manual (instead of 8 or 12 currently) and we will be able to get new contributors (because of the lower barrier, and because people who already provide translations for gnome-docs can also provide translations for our manual). We have two roles involved in the creation process: authors and translators. Every contributor can decide which side they want to be on. Even both will work, like currently...
I would very much welcome a lot of feedback from you. The wiki would probably be the best place to write down a migration strategy (what will happen with the screenshots, for example?)
PS: See also my blog entry about my thoughts on this: http://romanofskiat.wordpress.com/2008/05/19/lgm-aftermathnowlgm-aftermath/
Greetings,
Structural changes -> gettext?
I'm really, really happy about this news.
I've spent a lot of time (more than on pure translation) understanding
and using DocBook and XML,
and I'm still trying now. (I can't solve how to make ODF.)
I agree completely, and I'll translate only; you know my English is not good. When will we start, if it is decided?
Structural changes -> gettext?
On Fri, May 23, 2008 at 03:00:53PM +0200, Roman Joost wrote:
Hi folks,
after LGM with a lot of feedback, I would like to propose a structural change to how we currently write documentation.
I want to propose using gettext for translations and only one reference language for our manual. My last inquiry on the mailing list showed that most of you were really annoyed by the mixed-up languages in the XML files. Back in 2003, it was a new attempt at writing documentation, but it turned out to be hard to maintain, with a steep learning curve for new contributors.
hip hip hurrah!
Why gettext? It is currently a widespread technique for providing translations of documentation. The free software community provides a wide set of usable tools for translators.
Of course, I'm fully aware of the possible drawbacks of this approach. We need a strategy to migrate the current content and to set a reference language. I'd propose English, because every author who wants to introduce new content needs to write it in English first and translate it afterwards.
100% agree
After this migration, we concentrate on only one manual (instead of 8 or 12 currently) and we will be able to get new contributors (because of the lower barrier, and because people who already provide translations for gnome-docs can also provide translations for our manual). We have two roles involved in the creation process: authors and translators. Every contributor can decide which side they want to be on. Even both will work, like currently...
I'm mostly a translator but I'll do my best to help the migration process...
I would very much welcome a lot of feedback from you. The wiki would probably be the best place to write down a migration strategy (what will happen with the screenshots, for example?)
PS: See also my blog entry about my thoughts on this: http://romanofskiat.wordpress.com/2008/05/19/lgm-aftermathnowlgm-aftermath/
Greetings,
Structural changes -> gettext?
On Sat, May 24, 2008 at 01:53:32PM +0900, Choi, JiHui wrote:
I'm really, really happy about this news. I've spent a lot of time (more than on pure translation) understanding and using DocBook and XML,
and I'm still trying now. (I can't solve how to make ODF.)
I agree completely, and I'll translate only; you know my English is not good. When will we start, if it is decided?
As soon as possible. I hope I'll have a few minutes to set up a migration strategy on the wiki :)
Greetings,
Structural changes -> gettext?
Roman Joost wrote:
Hi folks,
...
I want to propose using gettext for translations and only one reference language for our manual. My last inquiry on the mailing list showed that most of you were really annoyed by the mixed-up languages in the XML files. Back in 2003, it was a new attempt at writing documentation, but it turned out to be hard to maintain, with a steep learning curve for new contributors.
I would cast my vote for gettext
...
Of course, I'm fully aware of the possible drawbacks of this approach. We need a strategy to migrate the current content and to set a reference language. I'd propose English, because every author who wants to introduce new content needs to write it in English first and translate it afterwards.
I think it's a reasonable choice
Greetings
Structural changes -> gettext?
A new adventure :-). Go!
Julien
Structural changes -> gettext?
According to the applause from the other writers, it seems that gettext
will be fine.
I have no knowledge of gettext, but understand that it is much the same
as poEdit, which I am using to edit po(t) files and create mo files.
If that's the case, it is only possible to edit one paragraph at a time.
In my opinion a great drawback. When translating I use the other
languages as a reference (if I understand them of course). Is this not
possible with gettext? poEdit (gettext?) is OK for shorter paragraphs,
but somewhat frustrating when you need to read several lines of text,
perhaps together with some text in the previous para.
Another problem: I'm still using Windows. If I can't use poEdit, is
there a simple way to install and run gettext on this OS? Something like
the installer for GIMP. If I need some extra packages, which?
I looked at "http://gnuwin32.sourceforge.net/install.html" but found
this site too fuzzy at first glance. But perhaps I'll study it more
seriously one rainy day in the future ;-) and get a better
understanding.
I'm using CygWin for my translation work on GIMP, if that matters.
Have a nice day
Kolbjørn
Structural changes -> gettext?
Kolbjørn Stuestøl wrote:
According to the applause from the other writers, it seems that gettext will be fine.
I have no knowledge of gettext, but understand that it is much the same as poEdit, which I am using to edit po(t) files and create mo files. If that's the case, it is only possible to edit one paragraph at a time. In my opinion a great drawback. When translating I use the other languages as a reference (if I understand them of course). Is this not possible with gettext? poEdit (gettext?) is OK for shorter paragraphs, but somewhat frustrating when you need to read several lines of text, perhaps together with some text in the previous para.
I don't know for sure, but I guess that related paragraphs will stay close to each other in the resulting po file, so understanding the context should be possible.
Another problem: I'm still using Windows. If I can't use poEdit, is there a simple way to install and run gettext on this OS?
You can surely use poEdit, or a bare text editor, to edit po files, but I am afraid that you will need a GNU toolchain to actually build the manual (get HTML out of the English original and the translated po file).
Something like the installer for GIMP. If I need some extra packages, which? I looked at "http://gnuwin32.sourceforge.net/install.html" but found this site too fuzzy at first glance. But perhaps I'll study it more seriously one rainy day in the future ;-) and get a better understanding.
I'm using CygWin for my translation work on GIMP, if that matters. Have a nice day
Kolbjørn
Greets
Structural changes -> gettext?
Kolbjørn Stuestøl wrote:
Roman Joost wrote:
...
I'm using CygWin for my translation work on GIMP, if that matters.
Just noticed that gettext is available as a package for Cygwin (see http://cygwin.com/packages/), so I guess all you need to do is launch the Cygwin setup.exe, then download and install it.
Have a nice day
Cheers
Structural changes -> gettext?
Gettext is the underlying system that extracts the .po files you
edit in poEdit.
poEdit, however, is just an editor that edits po files, with an
interface that shows you just one paragraph at a time. Once you have
the .po file you don't have to edit it in poEdit - any text editor
will do - there you will be able to see the whole text and use it
as a basis for your translation. However, unlike inside poEdit, you
will have to be careful to preserve the .po file markings (quoting
every line, for example).
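To make the "quoting every line" point concrete, here is a minimal Python sketch of how a long string is laid out in a .po file: the keyword is followed by an empty string, and each continuation line is its own double-quoted string that gettext concatenates. The helper names (po_quote, po_field) and the wrap width are made up for illustration, and the sketch assumes ordinary single-spaced text; real tools handle more cases (newlines, plural forms, comments).

```python
import textwrap


def po_quote(text):
    """Escape backslashes and double quotes, then wrap the text in quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def po_field(keyword, text, width=40):
    """Format one msgid/msgstr field, splitting long text over quoted lines."""
    chunks = textwrap.wrap(text, width) or [""]
    if len(chunks) == 1:
        return f"{keyword} {po_quote(chunks[0])}"
    # Multi-line form: an empty first string, then one quoted string per line.
    lines = [f'{keyword} ""']
    for i, chunk in enumerate(chunks):
        # Adjacent strings are joined with nothing in between, so every
        # chunk except the last must carry its own trailing space.
        tail = " " if i < len(chunks) - 1 else ""
        lines.append(po_quote(chunk + tail))
    return "\n".join(lines)


if __name__ == "__main__":
    print(po_field("msgid", 'Use the "Layers" dialog to add a new layer to the image.', 30))
    print(po_field("msgstr", ""))
```

Forgetting one of these quotes or trailing spaces in a plain text editor is exactly the kind of mistake a dedicated .po editor prevents.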
regards,
js ->
Structural changes - gettext?
Hi Roman, and everyone.
Sorry for not commenting here earlier. I've been in Poland, where Roman took the decision to change our translation process.
I surely welcome the change. We have not been able to start a pt_BR version of the manual due to the entry barriers cited before (i.e. dealing with files containing several languages).
The new system will separate the DocBook files into several .xml files, and the current plan is to use the gettext tool chain to make it possible to translate a master file in the reference language (English - any other would not make sense, since this is the common language of all translators/authors) into other languages. As far as I know, the system that is easiest to put in place will then just convert the .po files translated in this way back to .xml files.
But, in fact, there are more drawbacks to using gettext and .po files than just having a master language. The major one seems to be the need for an in-order, paragraph-by-paragraph translation, which might not always be the best possible translation and simply takes too much freedom away from the translators.
However, if the gettext system generates XML files from the POs for the other languages as part of the build (Roman, can you confirm that?), I think that soon after we change the structure we might figure out a tool chain to bypass the gettext->PO->xml steps and do an en.xml -> target_language.xml translation that could preserve most of the freedom that existed before the change. We would need, for example, an editor that enables editing the [.en and target language] XML files side by side, which could be better than using plain gettext.
Note that a necessary step is the splitting of the files into different languages and the adoption of a "master language". But in doing this, some freedom would remain, and the process of translation itself would flow better. Most .po editing tools only allow the translator to see one paragraph at a time, which, as Kolbjørn noted, is very inconvenient for translation. Editing the .po files raw would have the inconvenience of requiring the translator to place the quotes on every line (for most editors). So we'd require a custom .po editor, and still be bound to the order of the paragraphs. Using straight XML for the translations would most likely require a custom editor anyway; however, a lot of good editors allow one to split windows to see the two files at once and, most importantly, would not bind the translators to the paragraph order.
This does not prevent us from first changing the structure to a chain requiring gettext (because such a change might be easier to do), then at a second moment finding or creating an appropriate XML editor that would allow us to bypass gettext->po and work with an xml->xml chain.
What do you think?
js ->
Structural changes -> gettext?
Alessandro Falappa wrote:
Kolbjørn Stuestøl wrote:
Roman Joost wrote:
...
I'm using CygWin for my translation work on GIMP, if that matters.
Just noticed that gettext is available as a package for Cygwin (see http://cygwin.com/packages/), so I guess all you need to do is launch the Cygwin setup.exe, then download and install it.
Have a nice day
Cheers
Thank you all :-)
And thank you for using your time on this question.
I'll try the suggestions you mentioned. Seems that I have to study
gettext more carefully sometime.
Kolbjørn
Structural changes - gettext?
On Tue, Jun 03, 2008 at 10:18:59AM -0300, Joao S. O. Bueno wrote:
Hi Roman, and everyone.
Sorry for not commenting here earlier. I've been in Poland, where Roman took the decision to change our translation process.
I was not away, but I am late too...
I surely welcome the change. We have not been able to start a pt_BR version of the manual due to the entry barriers cited before (i.e. dealing with files containing several languages).
The new system will separate the DocBook files into several .xml files, and the current plan is to use the gettext tool chain to make it possible to translate a master file in the reference language (English - any other would not make sense, since this is the common language of all translators/authors) into other languages. As far as I know, the system that is easiest to put in place will then just convert the .po files translated in this way back to .xml files.
Yes, I agree.
But, in fact, there are more drawbacks to using gettext and .po files than just having a master language. The major one seems to be the need for an in-order, paragraph-by-paragraph translation, which might not always be the best possible translation and simply takes too much freedom away from the translators.
This drawback does not exist (IMHO), since a gettext string could easily fit several paragraphs at a time. If you go and look at the manual structure, for the largest part of it there _is_ a strict correspondence between the English paragraphs (or grouped paragraphs) and the other languages' sections.
However, if the gettext system generates XML files from the POs for the other languages as part of the build (Roman, can you confirm that?), I think that soon after we change the structure we might figure out a tool chain to bypass the gettext->PO->xml steps and do an en.xml -> target_language.xml translation that could preserve most of the freedom that existed before the change.
That freedom (IMHO) is neither necessary nor desirable.
We would need, for example, an editor that enables editing the [.en and target language] XML files side by side, which could be better than using plain gettext.
KBabel, poEdit or even Emacs in po-mode have plenty of options so useful for translating that I really do not understand why it should be desirable to translate XML instead of po files...
Note that a necessary step is the splitting of the files into different languages and the adoption of a "master language". But in doing this, some freedom would remain, and the process of translation itself would flow better. Most .po editing tools only allow the translator to see one paragraph at a time, which, as Kolbjørn noted, is very inconvenient for translation.
Kolbjørn (sorry, I hope this does not offend Kolbjørn) is not really aware of the gettext workflow, so do not take him as an example...
Editing the .po files raw would have the inconvenience of requiring the translator to place the quotes on every line (for most editors).
Not with .po file editors... and consider that by using plain .po files you (potentially) open the door to much friendlier translation tools like Rosetta or Pootle.
So we'd require a custom .po editor, and still be bound to the order of the paragraphs.
Mmm, I still think we do not need a custom .po editor...
Using straight XML for the translations would most likely require a custom editor anyway; however, a lot of good editors allow one to split windows to see the two files at once and, most importantly, would not bind the translators to the paragraph order.
poEdit, Emacs and KBabel can all show the piece of code that the string refers to... shouldn't this be enough?
This does not prevent us from first changing the structure to a chain requiring gettext (because such a change might be easier to do), then at a second moment finding or creating an appropriate XML editor that would allow us to bypass gettext->po and work with an xml->xml chain.
mmm
What do you think?
I think that we should concentrate on how to extract .po files from the XML and re-merge the strings into the main .xml files. Using .po files is _much_ more powerful, so we should consider only the problem(s) that relate to this approach.
One real problem could be that a single correction in the English strings (a missing comma, for example) invalidates (marks as fuzzy) the strings in many languages. And the default behaviour is to use English when a string is fuzzy or missing.
A solution could be to:
1. not use the English strings when a string is missing, but simply leave a marker that eventually points to the English version of the paragraph(s), or else keep the default behaviour... here I would like to know what you all think about this...
2. fuzzy strings are considered missing, and _this_ could really be a drag. One mechanism for avoiding this could be to consider fuzzy strings, or all fuzzy strings older than, say, 3 or 6 months, as valid for conversion to HTML.
What do you think?
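The two points above could be sketched roughly as follows in Python. This is only an illustration of the proposed policy under stated assumptions, not an existing gettext feature: the Entry class, the fuzzy_since field, the 90-day grace period and the marker text are all hypothetical (standard .po files do not record when an entry became fuzzy), and "older than 3 or 6 months" is read here as "has been fuzzy for at least that long".

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Entry:
    msgid: str                          # English source string
    msgstr: str                         # translation, empty when missing
    fuzzy: bool = False                 # True when the English text changed
    fuzzy_since: Optional[date] = None  # hypothetical: date it became fuzzy


def text_for_build(entry, today, grace=timedelta(days=90), english_fallback=False):
    """Pick the text to put into the generated HTML for one entry."""
    # A clean translation is always used.
    if entry.msgstr and not entry.fuzzy:
        return entry.msgstr
    # Point 2: a fuzzy translation that has waited long enough is accepted.
    if entry.msgstr and entry.fuzzy_since is not None:
        if today - entry.fuzzy_since >= grace:
            return entry.msgstr
    # Point 1: either keep the default (English text) or leave a marker.
    if english_fallback:
        return entry.msgid
    return f"[untranslated: {entry.msgid}]"


if __name__ == "__main__":
    today = date(2008, 6, 10)
    entry = Entry("Open", "Apri", fuzzy=True, fuzzy_since=date(2008, 6, 1))
    print(text_for_build(entry, today))
```

Whether the marker or the English text is the better fallback is exactly the open question raised in point 1, so the sketch leaves it as a switch.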
Structural changes - gettext?
On Monday 09 June 2008, Marco Ciampa wrote:
But, in fact, there are more drawbacks to using gettext and .po files than just having a master language. The major one seems to be the need for an in-order, paragraph-by-paragraph translation, which might not always be the best possible translation and simply takes too much freedom away from the translators.
This drawback does not exist (IMHO), since a gettext string could easily fit several paragraphs at a time. If you go and look at the manual structure, for the largest part of it there _is_ a strict correspondence between the English paragraphs (or grouped paragraphs) and the other languages' sections.
Several paragraphs - like grouped paragraphs - in a single .po string would indeed have no drawbacks in my opinion.
One paragraph per string, on the other hand, is what I would not like to see, as the translator would have to stick to an almost literal, sentence-by-sentence translation. With paragraph groups, however, there is room for a more natural text conveying the same information. (And yet I do think most of the time the translation will be done paragraph by paragraph; however, the few parts where it will feel more natural to rephrase some content are very important, IMHO.)
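The difference between the two granularities can be shown with a small Python sketch using only the standard library. It is an illustration of the idea, not how xml2po works internally; the sample DocBook fragment and the function names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented sample; real manual files are much larger.
SAMPLE = """<sect1>
  <title>Layers</title>
  <para>A layer is like a transparent sheet.</para>
  <para>Layers can be stacked in any order.</para>
</sect1>"""


def normalize(element):
    """Collect all text inside an element and collapse the whitespace."""
    return " ".join("".join(element.itertext()).split())


def strings_per_para(root):
    """One translatable string per <para>: strict paragraph order."""
    return [normalize(p) for p in root.iter("para")]


def strings_per_section(root):
    """One translatable string per <sect1>: its paragraphs grouped together."""
    return ["\n\n".join(normalize(p) for p in sect.iter("para"))
            for sect in root.iter("sect1")]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(strings_per_para(root))     # two short strings
    print(strings_per_section(root))  # one string holding both paragraphs
```

With the grouped form a translator may merge or reorder the two sentences inside the single string, which is the freedom being argued for here.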
js
->
Structural changes - gettext?
Marco Ciampa wrote:
On Tue, Jun 03, 2008 at 10:18:59AM -0300, Joao S. O. Bueno wrote:
Kolbjørn (sorry, I hope this does not offend Kolbjørn) is not really aware of the gettext workflow, so do not take him as an example...
I'm not offended :-)
I am a novice at this game, without any knowledge of gettext, and was
only wondering if it is possible to go on with the translation even when
using Windows. I use poEdit when I need a program to edit po
files, and was afraid that gettext is similarly restricted, for example
showing only two languages at a time, the original and the translation.
That's all.
Kolbjørn
Structural changes - gettext?
Hi,
IMO, getting grouped paras in .pot files would be too much hard work.
.pot files are created with the xml2po command, which extracts 'en' strings
into .pot msgid strings.
To get grouped paras we would have to modify hundreds of XML files, each
of them particular, in every language!
Reading fluency is not a problem since we can have the 'en' HTML file on the same screen.
A real problem is how to build all the .pot files (one .pot file per XML file)
automatically.
Another, more difficult problem is how to automatically transfer
translated strings from the XML to msgstr strings in .pot files.
Roman is working on that...
Julien
Structural changes - gettext?
On Tuesday 10 June 2008, julien wrote:
Hi,
IMO, getting grouped paras in .pot files would be too much hard work. .pot files are created with the xml2po command, which extracts 'en' strings into .pot msgid strings.
To get grouped paras we would have to modify hundreds of XML files, each of them particular, in every language!
Not exactly modifying by hand. The idea of XML files is that they are easily processed with scripts. We'd just need to either modify po2edit or create another script for doing that. (And yes, I could do that.)
Reading fluency is not a problem since we can have the 'en' html file on the same screen.
The problem is not about reading, although I am a bit concerned about this, since most PO editing tools (KBabel, Bluefish) show the _current string_ out of the flow of the adjacent strings (although they can be seen nearby, it is not the same as looking at the text).
This is a matter of one extra feature in .po editor programs, so it does not concern us at this point.
The ability to rewrite the paragraphs and move information around, so that translations are not paragraph by paragraph but block by block, goes deeper than that, IMHO, and is important for keeping the text enjoyable across different languages.
A real problem is how to build all the .pot files (one .pot file per XML file) automatically.
Hmm? I thought this was the work of po2edit - what is hard about feeding the XML files to a pot-writing script? Sorry, but this sounds much more like a technical thing that can be worked around in a couple of hours (if not minutes) than a "real" problem. Being able to do out-of-order paragraph translation is IMHO much more of an issue than being able to use a "for" statement in bash or any scripting language.
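As a rough idea of the "for" loop meant here, the following Python sketch builds one xml2po command per XML file, mirroring the source tree in a separate .pot directory. The directory names (src, pot) and the function name are assumptions made up for the example, and the commands are only constructed, not executed; the sketch relies on xml2po's -o option for naming the output file.

```python
from pathlib import PurePosixPath


def pot_command(xml_file, src_root="src", pot_root="pot"):
    """Return the xml2po call that extracts one XML file into its own .pot."""
    # Keep the same relative path, but under pot_root and with a .pot suffix.
    relative = PurePosixPath(xml_file).relative_to(src_root)
    pot_file = (PurePosixPath(pot_root) / relative).with_suffix(".pot")
    return ["xml2po", "-o", str(pot_file), str(xml_file)]


if __name__ == "__main__":
    # Hypothetical file names, one .pot per XML file.
    for name in ["src/toolbox/paint.xml", "src/dialogs/layers.xml"]:
        print(" ".join(pot_command(name)))
```

Each list could be handed to subprocess.run once the target directories exist; the harder part mentioned in the thread, moving the existing translations into msgstr strings, is not covered by this loop.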
Another more difficult problem is how to automatically transfer translated strings from xml to msgstr strings in .pot files. Roman is working on that...
Roman - are you? How are you doing? The experience I had in parsing these files 2 years ago might come in handy here.
js ->
Julien