new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

This discussion is connected to the gimp-developer-list.gnome.org mailing list which is provided by the GIMP developers and not related to gimpusers.com.

14 of 15 messages available

GIMP GBR format spec Sven Neumann 16 Jul 17:35
  GIMP GBR format spec Nick Lamb 16 Jul 20:44
A comment on CinePaint (was Re: new-xcf) Robin Rowe 18 Jul 06:33
20030717165600.B86E1102E3@l... 07 Oct 20:22
  new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Alan Horkan 17 Jul 20:05
   new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Manish Singh 17 Jul 20:53
    new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Roger Leigh 17 Jul 23:22
     new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Alan Horkan 18 Jul 01:20
   new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Christopher Curtis 17 Jul 23:10
    new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Alan Horkan 18 Jul 01:41
     new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Christopher W. Curtis 18 Jul 03:52
      new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Joao S. O. Bueno 18 Jul 15:07
       new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Tomas Ogren 18 Jul 22:16
         new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Marc A. Lehmann 20 Jul 00:15
         new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18] Adam D. Moss 20 Jul 12:45
Sven Neumann
2003-07-16 17:35:42 UTC (over 21 years ago)

GIMP GBR format spec

Hi,

tino.schwarze@informatik.tu-chemnitz.de (Tino Schwarze) writes:

I think, the security argument against JAR is very far-fetched. A JAR is basically a ZIP with a META-INF directory containing a MANIFEST.MF file. That's it.

There is a lot of code around for creating / reading ZIP files - I'm a bit worried about robustness though; if the directory at the end of the ZIP is broken or missing, things get complicated.

I don't think we should use a compressed archive. Instead, the binary data in the archive should be compressed. That allows us to choose the best compression scheme for the data and to combine different compression techniques in the archive. Compressing the whole archive again would probably reduce the size only marginally and would add unnecessary complexity. Your robustness argument is also a very good argument against compressing the whole archive.

But a hierarchical structure would be cool too. What about mapping big parts of the file format to the file system? This way, a lot of information can be stored in the hierarchy and it wouldn't be a big difference whether to read a file from file system or from archive.

As I pointed out in an earlier mail, I am not sure if a hierarchical structure in the archive is a good idea. In my opinion the hierarchy should only be defined in the XML part that describes how the contents of the archive should be put together. If we apply the document hierarchy to the archive, it will become painful to keep the XML description and the archive hierarchy in sync.

Sven

Nick Lamb
2003-07-16 20:44:44 UTC (over 21 years ago)

GIMP GBR format spec

On Wed, Jul 16, 2003 at 05:35:42PM +0200, Sven Neumann wrote:

I don't think we should use a compressed archive. Instead, the binary data in the archive should be compressed. That allows us to choose the best compression scheme for the data and to combine different compression techniques in the archive.

Yes! Just in case anyone taking part in this discussion didn't know: the compression routines used by gzip and ZIP are generic text compression techniques. They don't understand interleaved formats (commonly used for RGB data, stereo audio, etc.) or multi-byte representations such as the 32-bit IEEE floats we might be using in The GIMP in a few years, so they produce rather poor results compared to specialised compression techniques, which The GIMP could inherit from existing Free Software.
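
To make the point about interleaved data concrete, here is a minimal sketch using only the Python standard library. It compares zlib on raw interleaved RGB with zlib applied after a per-channel delta step, similar in spirit to PNG's Sub filter. The synthetic gradient data and the helper names (`make_gradient`, `delta_per_channel`, `undo_delta`) are assumptions made for illustration; nothing here is taken from GIMP.

```python
import zlib

CHANNELS = 3  # interleaved R, G, B samples


def make_gradient(pixels: int) -> bytes:
    """Synthetic interleaved RGB: each channel is a ramp with its own slope."""
    out = bytearray()
    for i in range(pixels):
        out += bytes(((i * 1) % 256, (i * 2) % 256, (i * 3) % 256))
    return bytes(out)


def delta_per_channel(data: bytes, channels: int = CHANNELS) -> bytes:
    """Replace each sample by its difference (mod 256) from the previous
    sample of the same channel. The first pixel is kept unchanged."""
    out = bytearray(data[:channels])
    for i in range(channels, len(data)):
        out.append((data[i] - data[i - channels]) % 256)
    return bytes(out)


def undo_delta(data: bytes, channels: int = CHANNELS) -> bytes:
    """Inverse of delta_per_channel, so the step is lossless."""
    out = bytearray(data[:channels])
    for i in range(channels, len(data)):
        out.append((data[i] + out[i - channels]) % 256)
    return bytes(out)


raw = make_gradient(10_000)
generic = zlib.compress(raw, 9)                           # knows nothing about channels
channel_aware = zlib.compress(delta_per_channel(raw), 9)  # same compressor, channel-aware filter first
print(len(raw), len(generic), len(channel_aware))
```

On this kind of smooth data the filtered stream compresses noticeably better, because the compressor sees a short repeating pattern instead of three interleaved ramps. Real images will behave less dramatically.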

Nick.

Alan Horkan
2003-07-17 20:05:45 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

On Thu, 17 Jul 2003 gimp-developer-request@lists.xcf.berkeley.edu wrote:

If we really are in brainstorming mode here, following the suggestions listed above, how about a format something like the following, which is essentially just an XML preamble, followed by raw binary data:

The nice thing about this is that it should be fully parseable by XML parsers (up until the first NULL [1 is required, the rest are optional

Fully parseable XML until it isn't :)

It is far better not to use XML at all than to break XML. (Incidentally, this is similar to what has been suggested for CinePaint.)

The proper XML way to do what you describe would be to take the raw binary and base64-encode it (ick), which is grossly inefficient for anything large. The more sensible, and still valid, way is to use a container format and to link to the raw BLOB (binary large object), which would be another file in your container format.
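
As a rough illustration of that inefficiency, the sketch below measures the size overhead of base64 on a stand-in binary blob. The blob contents and size are arbitrary assumptions; the 4-characters-per-3-bytes ratio is a property of base64 itself.

```python
import base64

# 307,200 bytes standing in for pixel data (hypothetical size and content)
blob = bytes(range(256)) * 1200
encoded = base64.b64encode(blob)

# Base64 maps every 3 input bytes to 4 output characters,
# so the encoded form is about one third larger than the raw data.
overhead = len(encoded) / len(blob)
print(f"{len(blob)} raw bytes -> {len(encoded)} base64 characters ({overhead:.2f}x)")
```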

I just don't see using another archive format as giving you anything. So say you use ZIP or JAR or TAR or AR: you still have to unpack (and possibly decompress) the thing just to find out what's in it.

OTOH, any

Using Zip as a container is not "On The Other Hand", it does not prevent any of the things you are suggesting.

Zip allows you to grab just one file out of the archive if that is all you want, and you can have different files inside a Zip archive, each with a different amount of compression (including no compression).

program that can open a file can read the XML header here, even if they don't parse it, it's still human readable. And this lets you do your

Run 'head' on an OpenOffice document and you will see that the manifest is left uncompressed so that you can easily read it as text.

fancy compression-based-on-data-type instead of generic-text-compression over each layer or the whole archive.

If GIMP were to use Zip/Jar only as a container and not use the Zip compression, then the whole container could be compressed using different / "better" compression if that is what you want. (I say "better" very guardedly because compression is often a trade-off against how long you want to spend compressing or decompressing.)
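
The Zip properties mentioned above (per-member compression settings, pulling out a single member, and an uncompressed member being readable as plain text) can be sketched with Python's standard `zipfile` module. The member names `manifest.xml` and `layer0.bin` and their contents are hypothetical; this is not a proposed XCF layout.

```python
import io
import zipfile

manifest = b'<image width="64" height="64"><layer src="layer0.bin"/></image>'
layer = bytes(64 * 64 * 3)  # hypothetical blank RGB layer

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    # Stored (no compression): the bytes appear verbatim in the archive.
    zf.writestr("manifest.xml", manifest, compress_type=zipfile.ZIP_STORED)
    # Deflated: generic compression applied to this member only.
    zf.writestr("layer0.bin", layer, compress_type=zipfile.ZIP_DEFLATED)

archive = buf.getvalue()

with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    # Pull out a single member by name via the archive's directory.
    only_manifest = zf.read("manifest.xml")
    methods = {info.filename: info.compress_type for info in zf.infolist()}

# Because the manifest is stored uncompressed, it is visible in the raw bytes.
manifest_visible = manifest in archive
print(methods, manifest_visible)
```

Whether a stored manifest also ends up near the start of the file depends on the order in which the writer adds members, which this sketch controls but a given application may not.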

yosh wrote:

data offset is not predictable. But I assert that that is irrelevant because you can specify it to be anywhere.

Another downside: needing a special tool to manipulate it.

To reiterate what Yosh said, needing specialised external tools would require more development work, and would add complexity rather than make things simpler. By reusing standards you can leverage existing tools, libraries and other people's work, leaving more time to focus on image manipulation.

That's the advantage of using a standard format. Using standard tools to manipulate it. More likelihood of a machine having a tool installed, and less work for the GIMP team in maintaining special tools.

You're right about simplicity though, and ar is simpler than tar or zip/jar, which is why I prefer it. zip/jar is especially crappy since the file index is at the end, which means it's harder to recover from a partial file.

I think the JAR format gets around the Zip crappiness by putting the manifest.xml at the start of the file. I could not say how hard it is but Winzip seems to do a pretty good job of repairing any broken zip archives I throw at it, at least allowing me to get some of the files out.

- Alan

Manish Singh
2003-07-17 20:53:40 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

On Thu, Jul 17, 2003 at 07:05:45PM +0100, Alan Horkan wrote:

I think the JAR format gets around the Zip crappiness by putting the manifest.xml at the start of the file.

Not on the OpenOffice document I have here. manifest.xml is at the end of the file, and compressed. Even if it did, manifest.xml doesn't have (and shouldn't have) container format offsets and lengths. So it doesn't help.

I could not say how hard it is but Winzip seems to do a pretty good job of repairing any broken zip archives I throw at it, at least allowing me to get some of the files out.

Not everyone has winzip.

I don't see a compelling argument to use zip/jar. It's complexity that doesn't buy us anything over ar.

-Yosh

Christopher Curtis
2003-07-17 23:10:02 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

Alan Horkan wrote:

It is far better not to use XML at all than to break XML. (Incidentally, this is similar to what has been suggested for CinePaint.)

Just for the record ... I read the CinePaint file format, and it doesn't even resemble XML. My "PREAMBLE" is valid XML. If they implement what they have written, they don't even bother with things like closing tags or putting parameters in quotes.

The proper XML way to do what you describe would be to take the raw binary and base 64 encode it (ick) which is grossly inefficient for anything

Which is why I said preamble.

large. The more sensible way and still valid way is to use a container format and to link to the raw BLOB (binary large object) that would be another file in your container format.

Which is what, at this point, I would prefer.

OTOH, any

Using Zip as a container is not "On The Other Hand", it does not prevent any of the things you are suggesting.

Using a container at all is OTOH.

Run 'head' on an OpenOffice document and you will see that the manifest is left uncompressed so that you can easily read it as text.

OpenOffice documents are zipped; you can't head them.

btw: META-INF/manifest.xml is at the end of my .sxi file.

Chris

Roger Leigh
2003-07-17 23:22:17 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

Manish Singh writes:

I don't see a compelling argument to use zip/jar. It's complexity that doesn't buy us anything over ar.

$ ar t gimp1.2-print_4.2.5-4_i386.deb
debian-binary
control.tar.gz
data.tar.gz

The Debian dpkg ".deb" package format uses an ar archive with gzip compressed members. It's very robust, and it's simple to extract information from any of the members as needed. e.g.

$ ar p gimp1.2-print_4.2.5-4_i386.deb control.tar.gz | tar xfz - ./md5sums
$ cat md5sums
3698d1f4ce3025bc8c0af73aad39c351 usr/lib/gimp/1.2/plug-ins/print
a9e993933c62cf972a07ba60d099a5be usr/share/doc/gimp1.2-print/html/FAQ.html
0f06e25e158d58be369f6c81c74f350f usr/share/doc/gimp1.2-print/html/print-color.png
8af9040e743fdea01d048a9625be3f37 usr/share/doc/gimp1.2-print/html/print-main.png
9bcaba3b091edb324a4e3658d3b4c17b usr/share/doc/gimp1.2-print/html/print-setup.png
bdb27f0b9e600cbf067b34d26b62727b usr/share/doc/gimp1.2-print/samples/colorbars4.png
cd6014ab378eeebbaee1723b78ef4459 usr/share/doc/gimp1.2-print/samples/colorsweep.png
a9e993933c62cf972a07ba60d099a5be usr/share/doc/gimp1.2-print/FAQ.html
d41331233e7703ff0c7f365a1f1fa2a4 usr/share/doc/gimp1.2-print/README.Debian
37ae0a31af00c0fa8569104c96927391 usr/share/doc/gimp1.2-print/copyright
9581201d2bf1fc7b5fcf4c3463d79854 usr/share/doc/gimp1.2-print/changelog.gz
7b58392e6bc678907651c89bdb134763 usr/share/doc/gimp1.2-print/README.gz
5fa90d8012eebb8038dd991d46155ff5 usr/share/doc/gimp1.2-print/changelog.Debian.gz

Alan Horkan
2003-07-18 01:20:04 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

On Thu, 17 Jul 2003, Roger Leigh wrote:

Manish Singh writes:

I don't see a compelling argument to use zip/jar. It's complexity that doesn't buy us anything over ar.

$ ar t gimp1.2-print_4.2.5-4_i386.deb
debian-binary
control.tar.gz
data.tar.gz

The Debian dpkg ".deb" package format uses an ar archive with gzip compressed members. It's very robust, and it's simple to extract information from any of the members as needed. e.g.

$ ar p gimp1.2-print_4.2.5-4_i386.deb control.tar.gz | tar xfz - ./md5sums

You used a pipe.

What happened there is that you unzipped the entire archive and then, in a separate operation, extracted just the files you wanted.

There is nothing wrong with that; you get better compression that way.

In a zip archive you really do extract just the single file; there is no unzipping of the whole archive first, which is useful if you just want to grab one or two files quickly from a large archive.
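
The difference in access pattern can be sketched as follows. The comparison is between a gzip-compressed tar (like the control.tar.gz member above) and a zip with the same contents. A gzip stream has to be decoded sequentially to reach a later member, while a zip reader can jump to a member via the directory. `CountingReader` is a hypothetical helper written for this sketch, and the member names and sizes are assumptions.

```python
import io
import os
import tarfile
import zipfile


class CountingReader(io.BytesIO):
    """In-memory file that counts how many bytes are read from it."""

    def __init__(self, data: bytes):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk


big = os.urandom(200_000)   # incompressible stand-in for pixel data
small = b"tiny member\n"

# The same two members, once as a gzip-compressed tar, once as a zip.
tgz_buf = io.BytesIO()
with tarfile.open(fileobj=tgz_buf, mode="w:gz") as tf:
    for name, data in (("big.bin", big), ("small.txt", small)):
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))

zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("big.bin", big)
    zf.writestr("small.txt", small)

# Extract only the small member from each and count the bytes touched.
tgz_reader = CountingReader(tgz_buf.getvalue())
with tarfile.open(fileobj=tgz_reader, mode="r:gz") as tf:
    from_tgz = tf.extractfile("small.txt").read()

zip_reader = CountingReader(zip_buf.getvalue())
with zipfile.ZipFile(zip_reader) as zf:
    from_zip = zf.read("small.txt")

print("tar.gz bytes read:", tgz_reader.bytes_read)
print("zip bytes read:   ", zip_reader.bytes_read)
```

Note that this says nothing against the outer ar layer of a .deb, where each member can be located from its header without decompressing anything.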

- Alan

Alan Horkan
2003-07-18 01:41:59 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

On Thu, 17 Jul 2003, Christopher Curtis wrote:

It is far better not to use XML at all than to break XML. (Incidentally, this is similar to what has been suggested for CinePaint.)

even resemble XML. My "PREAMBLE" is valid XML. If they implement what they have written, they don't even bother with things like closing tags or putting parameters in quotes.

A preamble, which is effectively a full XML file, then a boundary, then more information which is effectively another file. Two files in one file sounds like an ad-hoc container to me.

Which is what, at this point, I would prefer.

OTOH, any

Using Zip as a container is not "On The Other Hand", it does not prevent any of the things you are suggesting.

Using a container at all is OTOH.

Run 'head' on an OpenOffice document and you will see that the manifest is left uncompressed so that you can easily read it as text.

OpenOffice documents are zipped; you can't head them.

btw: META-INF/manifest.xml is at the end of my .sxi file.

I made the terrible mistake of generalising from one instance. It is doable, but it was just coincidental that it was done that way in the file I looked at.

While I am apologising, I may as well repeat what I said off-list. I only used WinZip as an example; there are several programs which can recover parts of zip files, so repairing damaged zip files is possible (although I can't guess how difficult it is to do). I expect there must be a command-line tool for zip files on Unix; I just don't happen to know what it is yet.

- Alan

Christopher W. Curtis
2003-07-18 03:52:04 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

On 07/17/03 19:41, Alan Horkan wrote:

On Thu, 17 Jul 2003, Christopher Curtis wrote:

even resemble XML. My "PREAMBLE" is valid XML. If they implement what they have written, they don't even bother with things like closing tags or putting parameters in quotes.

A preamble, which is effectively a full XML file, then a boundary, then more information which is effectively another file. Two files in one file sounds like an ad-hoc container to me.

As interesting as what I said was, I don't see how your comment logically follows. Anyway ...

I only used WinZip as an example; there are several programs which can recover parts of zip files, so repairing damaged zip files is possible (although I can't guess how difficult it is to do).

This is something that shouldn't really be an issue. The ZIP format keeps the list of files at the end, so if the file is clipped, the directory is what gets lost, and you can recreate it by scanning the archive for delimiters. The reason it can be repaired at all is that the most likely thing to get lost is the meta-information.
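
A minimal sketch of that recovery idea: build a small zip, clip off the directory at the end, and rebuild the member list by scanning for the local file header signature. The function `scan_local_headers` and the sample member names are hypothetical. The sketch only handles stored (uncompressed) members whose sizes are recorded in the local header, and it assumes the member data does not itself contain the signature. A real repair tool has to cope with much more.

```python
import io
import struct
import zipfile

LOCAL_SIG = b"PK\x03\x04"      # local file header signature
CENTRAL_SIG = b"PK\x01\x02"    # central directory header signature
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")  # 30-byte fixed part of a local header


def scan_local_headers(data: bytes) -> dict:
    """Recover stored members by walking the local file headers."""
    members = {}
    pos = data.find(LOCAL_SIG)
    while pos != -1 and pos + LOCAL_HEADER.size <= len(data):
        (_sig, _ver, _flags, method, _time, _date, _crc,
         csize, _usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, pos)
        name_start = pos + LOCAL_HEADER.size
        body_start = name_start + name_len + extra_len
        name = data[name_start:name_start + name_len].decode("utf-8")
        if method == zipfile.ZIP_STORED:
            members[name] = data[body_start:body_start + csize]
        pos = data.find(LOCAL_SIG, body_start + csize)
    return members


original = {"manifest.xml": b"<image/>", "notes.txt": b"layer notes\n"}
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    for name, payload in original.items():
        zf.writestr(name, payload)

full = buf.getvalue()
clipped = full[:full.find(CENTRAL_SIG)]   # simulate losing the directory

try:
    zipfile.ZipFile(io.BytesIO(clipped))
    directory_readable = True
except zipfile.BadZipFile:
    directory_readable = False

recovered = scan_local_headers(clipped)
print(directory_readable, sorted(recovered))
```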

So, after some research, I've decided that ar is a fine container format. My only contribution, which you may take as you will, would be to specify that the first entry in the archive is the descriptive catalog. Naturally I'm thinking of the XML snippet I posted earlier, sans the data offset thing.

The advantage to this is that you can detect if the file is corrupt, and you have two ways of accessing data: via meta-information only, or via the actual data entry. This means there's no need to scan through the archive to find its contents, and it means that you can read the file using 'more' and it works fine (as long as the XML file is uncompressed).

The downside to using 'ar', really, is that WinZip doesn't support it. I haven't verified this - I hope a Windows user can do so for us. Just for reference, attached below is a C&P of an ar archive I just made:

bash-2.05b$ echo 1 > file1
bash-2.05b$ echo 2 > file2
bash-2.05b$ ar r myar.xcf file*
bash-2.05b$ (echo --; cat myar.xcf; echo --)
--
!<arch>
file1/          1058492021  1000  1000  100644  2         `
1
file2/          1058492025  1000  1000  100644  2         `
2
--
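
The catalog-first idea can be sketched in a few lines of Python. The code below writes a simplified GNU-style ar archive (60-byte headers, names terminated by '/', members padded to even length) with an XML catalog as the first member, and reads that first member back without scanning the rest. The names `write_ar`, `read_first_member`, `catalog.xml` and `layer0.bin` are hypothetical, timestamps and owner ids are simply written as 0, and long file names are not handled.

```python
import io

AR_MAGIC = b"!<arch>\n"
HEADER_LEN = 60


def write_ar(members) -> bytes:
    """Serialize (name, data) pairs as a simplified GNU-style ar archive."""
    out = io.BytesIO()
    out.write(AR_MAGIC)
    for name, data in members:
        header = (
            f"{name + '/':<16}"   # member name, '/'-terminated
            f"{0:<12}"            # modification time
            f"{0:<6}{0:<6}"       # owner and group ids
            f"{'100644':<8}"      # file mode (octal)
            f"{len(data):<10}"    # size in bytes
        ).encode("ascii") + b"`\n"
        if len(header) != HEADER_LEN:
            raise ValueError("member name too long for this sketch")
        out.write(header)
        out.write(data)
        if len(data) % 2:         # members are aligned to even offsets
            out.write(b"\n")
    return out.getvalue()


def read_first_member(archive: bytes):
    """Return (name, data) of the first member without scanning further."""
    if not archive.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    header = archive[len(AR_MAGIC):len(AR_MAGIC) + HEADER_LEN]
    if len(header) != HEADER_LEN or header[58:60] != b"`\n":
        raise ValueError("truncated or corrupt member header")
    name = header[0:16].decode("ascii").rstrip().rstrip("/")
    size = int(header[48:58].decode("ascii"))
    start = len(AR_MAGIC) + HEADER_LEN
    return name, archive[start:start + size]


catalog = b'<image><layer src="layer0.bin"/></image>\n'
archive = write_ar([("catalog.xml", catalog), ("layer0.bin", bytes(7))])
name, data = read_first_member(archive)
print(name, data.decode("ascii"), end="")
```

Since the catalog is uncompressed text at a fixed offset, a pager or text editor shows it right after the magic line, which is the property described above.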

Chris

Robin Rowe
2003-07-18 06:33:25 UTC (over 21 years ago)

A comment on CinePaint (was Re: new-xcf)

At 5:10 PM -0400 7/17/03, Christopher Curtis wrote:

Just for the record ... I read the CinePaint file format, and it doesn't even resemble XML.

Yeah, I've had that argument with Robin - and lost :(.

They are going for simple and scriptable over good design - I think they will regret it very soon...

Actually, simple and scriptable over good *XML* design was my criterion for CPX. Whether good design and good XML design are the same is a matter of opinion and circumstances.

In noting in the CPX spec that I had reused some good ideas from PPM P6 and XML I had not intended to suggest CPX was XML.

Cheers,

Robin
---------------------------------------------------------------------------
Robin.Rowe@MovieEditor.com Hollywood, California
www.CinePaint.org Free motion picture and still image editing software

Joao S. O. Bueno
2003-07-18 15:07:21 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

Christopher W. Curtis wrote:

The downside to using 'ar', really, is that WinZip doesn't support it. I haven't verified this - I hope a Windows user can do so for us. Just for reference, attached below is a C&P of an ar archive I just made:

Hmm... that just seems like no downside at all. You see, Windows users don't even have a common tool to edit large ASCII files. That a proprietary tool doesn't support this archive type should be of no concern.

They will be able to open the new GIMP file based on ar in Microsoft Word, if there is such a need for a format hackable by Windows users.

Chris

Tomas Ogren
2003-07-18 22:16:27 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

On 18 July, 2003 - Joao S. O. Bueno sent me these 0,8K bytes:

Christopher W. Curtis wrote:

The downside to using 'ar', really, is that WinZip doesn't support it. I haven't verified this - I hope a Windows user can do so for us. Just for reference, attached below is a C&P of an ar archive I just made:

Hmm... that just seems like no downside at all. You see, Windows users don't even have a common tool to edit large ASCII files.

vim? emacs? .. I bet there are many editors that can handle large text files..

/Tomas

Marc A. Lehmann
2003-07-20 00:15:48 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

On Fri, Jul 18, 2003 at 10:16:27PM +0200, Tomas Ogren wrote:

vim? emacs? .. I bet there are many editors that can handle large text files..

One thing to note (to bring this more on-topic again) is that vim doesn't handle "large" (gigabyte) files nicely, as it loads them into memory. The same is probably true for emacs. The only editor I know of (I didn't test millions of them, though) that handles large files nicely is joe, as it doesn't load them into memory.

For images, which might become big especially when storing a lot of extra info (undo info etc.), this is an issue ;)

Adam D. Moss
2003-07-20 12:45:43 UTC (over 21 years ago)

new-xcf [Re: Gimp-developer Digest, Vol 10, Issue 18]

pcg@goof.com (Marc A. Lehmann) wrote:

One thing (to bring this more on-topic again) to note is that vim doesn't handle "large" (gigabytes) files nice, loading it into memory. The same is probably true for emacs. The only editor I know (I didn't test millions of them though), that nicely handles large files is joe, as it doesn't load them into memory.

QEmacs does this too:

http://fabrice.bellard.free.fr/qemacs/

I think it's only for unixoids. Not sure.