proposal for libgimpmetadata API
This discussion is connected to the gimp-developer-list.gnome.org mailing list which is provided by the GIMP developers and not related to gimpusers.com.
proposal for libgimpmetadata API | Raphaël Quinet | 21 Jun 21:15 |
proposal for libgimpmetadata API | Sven Neumann | 21 Jun 22:54 |
proposal for libgimpmetadata API | Raphaël Quinet | 21 Jun 23:53 |
proposal for libgimpmetadata API | Sven Neumann | 22 Jun 00:13 |
proposal for libgimpmetadata API | Raphaël Quinet | 22 Jun 00:35 |
proposal for libgimpmetadata API | Sven Neumann | 22 Jun 08:10 |
proposal for libgimpmetadata API | Raphaël Quinet | 22 Jun 13:33 |
proposal for libgimpmetadata API | Sven Neumann | 22 Jun 20:18 |
proposal for libgimpmetadata API | Kevin Cozens | 27 Jun 22:40 |
proposal for libgimpmetadata API
In order to improve the support for metadata (XMP and EXIF), I would like to move some of the code that is currently located in the plug-ins/metadata directory into a new library, "libgimpmetadata". This library would be linked with all file plug-ins that support metadata. This includes JPEG, PNG, TIFF and maybe GIF, PSD, SVG and other file formats.
The metadata is stored internally as an XMP packet inside a gimp image parasite (so it is transparently saved in xcf files). Instead of manipulating this parasite directly, the file plug-ins can use the libgimpmetadata library for getting or setting individual properties in the XMP packet, for merging EXIF data into the XMP packet or for generating an EXIF block from parts of the XMP data. Due to the need to generate JPEG thumbnails, the library adds a dependency on libjpeg.
Sven asked for a review of the proposed API for this library, so here it is. Note that the library does not define new gobject classes or anything like that. This is simply a set of functions for accessing the metadata that is attached to the image (in a parasite that should be mostly opaque for the user).
/* decode the given XMP packet (read from a file) and merge it into the metadata parasite. */
gboolean
gimp_metadata_decode_xmp (gint32 image_ID,
                          const gchar *xmp_packet);

/* generate an XMP packet from the metadata parasite */
const gchar *
gimp_metadata_encode_xmp (gint32 image_ID);

/* decode the given EXIF block (read from a file) and merge it into the metadata parasite. */
gboolean
gimp_metadata_decode_exif (gint32 image_ID,
                           guint exif_size,
                           const gchar *exif_block);

/* generate an EXIF block from the EXIF-compatible parts of the metadata parasite */
gboolean
gimp_metadata_encode_exif (gint32 image_ID,
                           guint *exif_size,
                           const gchar **exif_block);

/* get the value(s) of a single XMP property */
gboolean
gimp_metadata_get (gint32 image_ID,
                   const gchar *schema,
                   const gchar *property,
                   GimpXMPPropertyType *type,
                   gint *num_values,
                   const gchar ***values);

/* set the value(s) of a single XMP property */
gboolean
gimp_metadata_set (gint32 image_ID,
                   const gchar *schema,
                   const gchar *property,
                   GimpXMPPropertyType type,
                   gint num_values,
                   const gchar **values);

/* delete a single XMP property */
gboolean
gimp_metadata_delete (gint32 image_ID,
                      const gchar *schema,
                      const gchar *property);

/* same as gimp_metadata_get() but simpler, for scalar properties */
const gchar *
gimp_metadata_get_scalar (gint32 image_ID,
                          const gchar *schema,
                          const gchar *property);

/* same as gimp_metadata_set() but simpler, for scalar properties */
gboolean
gimp_metadata_set_scalar (gint32 image_ID,
                          const gchar *schema,
                          const gchar *property,
                          const gchar *value);

/* register a non-standard XMP schema prefix for use in get/set procs */
gboolean
gimp_metadata_add_schema (gint32 image_ID,
                          const gchar *schema,
                          const gchar *schema_prefix);
Example of use:
- An image containing both XMP and EXIF information is loaded.
- Call gimp_metadata_decode_exif (image, exif_size, exif_block) to load the EXIF block into the gimp metadata parasite.
- Call gimp_metadata_decode_xmp (image, xmp_packet) to merge the XMP information into the gimp metadata parasite. If some properties are present in both XMP and EXIF (this is very likely), the old EXIF information is overwritten: XMP always takes precedence.
- Call gimp_metadata_set_scalar (image, "dc", "contributor", "John Doe"); to set the "dc:contributor" property.
- Call gimp_metadata_set_scalar (image, "http://ns.adobe.com/exif/1.0/", "UserComment", "foo!"); to set the "exif:UserComment" property. Here, the full schema URI is used but it would also be possible to use the "exif" prefix because this is a known prefix (standardized).
- Call xmp_packet = gimp_metadata_encode_xmp (image); to generate a new XMP packet suitable for saving into a file.
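The load order in the example above (EXIF first, then XMP, with XMP winning on conflicts) can be illustrated with a small stand-alone sketch. This is not GIMP code: MetadataStore, store_set(), store_get() and load_example() are hypothetical stand-ins for the metadata parasite and the proposed functions, and the property values are made up. It only shows the "last writer wins" merge that gives XMP precedence.

```c
#include <string.h>

/* Hypothetical stand-in for the metadata parasite: a small fixed-size
 * table mapping "prefix:property" keys to string values. */
#define MAX_PROPS 16
#define MAX_LEN   64

typedef struct
{
  char key[MAX_LEN];
  char value[MAX_LEN];
} Prop;

typedef struct
{
  Prop props[MAX_PROPS];
  int  count;
} MetadataStore;

/* Set a property, overwriting any existing value stored under the same
 * key.  Returns 1 on success, 0 if the table is full or a string is
 * too long to fit. */
static int
store_set (MetadataStore *store, const char *key, const char *value)
{
  int i;

  if (strlen (key) >= MAX_LEN || strlen (value) >= MAX_LEN)
    return 0;

  for (i = 0; i < store->count; i++)
    if (strcmp (store->props[i].key, key) == 0)
      {
        strcpy (store->props[i].value, value);
        return 1;
      }

  if (store->count >= MAX_PROPS)
    return 0;

  strcpy (store->props[store->count].key, key);
  strcpy (store->props[store->count].value, value);
  store->count++;
  return 1;
}

/* Look up a property; returns NULL if it is not set. */
static const char *
store_get (const MetadataStore *store, const char *key)
{
  int i;

  for (i = 0; i < store->count; i++)
    if (strcmp (store->props[i].key, key) == 0)
      return store->props[i].value;

  return NULL;
}

/* Mimic the load sequence: EXIF values first, then XMP values, then an
 * explicit property set by the plug-in or a script. */
static void
load_example (MetadataStore *store)
{
  store->count = 0;

  /* step 1: values that would come from the decoded EXIF block */
  store_set (store, "exif:UserComment", "from exif");
  store_set (store, "tiff:Make", "ExampleCam");

  /* step 2: values from the XMP packet overwrite matching keys */
  store_set (store, "exif:UserComment", "from xmp");

  /* step 3: an individual property set afterwards */
  store_set (store, "dc:contributor", "John Doe");
}
```

After load_example(), looking up "exif:UserComment" yields the XMP value, while "tiff:Make", which only appeared in the EXIF data, is kept unchanged.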
Most of the functions listed above are currently implemented in the metadata plug-in and exported in the PDB. So you can find a slightly longer description of these functions by looking in the Procedure Browser and searching for "metadata". If you are curious, you can also look in the code: plug-ins/metadata/metadata.c is where these functions are registered and plug-ins/metadata/xmp-model.c is where they are implemented. After moving these functions into the new libgimpmetadata library, they would still be exported to the PDB but probably renamed gimp-metadata-* instead of plug-in-metadata-*.
The type GimpXMPPropertyType is an enum that is currently defined as "XMPType" in plug-ins/metadata/xmp-schemas.h. That file would be moved into libgimpmetadata and changed accordingly (XMPProperty and XMPSchema would become private types).
The functions gimp_metadata_get/set_scalar() are simpler versions of the get/set() functions that are easier to use. In this proposed API, all scalar types would be converted to/from strings. Many properties are text strings (XMP_TYPE_TEXT, XMP_TYPE_URI), but some of them are integers (XMP_TYPE_INTEGER), booleans (XMP_TYPE_BOOLEAN) or slightly more complex types that are "almost" scalar (XMP_TYPE_DATE, XMP_TYPE_MIME_TYPE, XMP_TYPE_RATIONAL). I am not sure if converting these types to/from strings is the best option but that seems reasonable to me. Another option would be to have a special case for integers and booleans: add gimp_metadata_get/set_integer() and maybe rename get/set_scalar() to get/set_string().
The functions for converting to/from EXIF are not ready yet. I have started implementing them but have not gotten very far. I would like to avoid a dependency on libexif or on libtiff (EXIF is actually a TIFF block with a JPEG thumbnail, usually stored inside a JPEG file) because converting EXIF to XMP only requires a small subset of the API provided by these libraries.
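To make the string-conversion option more concrete, here is a minimal sketch of what a caller (or a future get_integer()/get_boolean() wrapper) would have to do with the string returned for a scalar property. The helper names parse_integer_value() and parse_boolean_value() are hypothetical and not part of the proposal, and accepting "true"/"false" in any letter case is an assumption made for this example.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a decimal integer from the string form of a scalar property.
 * Returns 1 and stores the result on success, 0 on malformed or
 * out-of-range input. */
static int
parse_integer_value (const char *text, int *result)
{
  char *end;
  long  value;

  if (text == NULL || *text == '\0')
    return 0;

  errno = 0;
  value = strtol (text, &end, 10);

  /* reject overflow, trailing garbage and values that do not fit in int */
  if (errno != 0 || *end != '\0' || value < INT_MIN || value > INT_MAX)
    return 0;

  *result = (int) value;
  return 1;
}

/* Compare two strings, ignoring ASCII letter case. */
static int
equals_ignore_case (const char *a, const char *b)
{
  while (*a != '\0' && *b != '\0')
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }

  return *a == '\0' && *b == '\0';
}

/* Parse a boolean from its string form.  Returns 1 and stores 1 or 0
 * in *result on success, 0 if the text is neither "true" nor "false". */
static int
parse_boolean_value (const char *text, int *result)
{
  if (text == NULL)
    return 0;

  if (equals_ignore_case (text, "true"))
    {
      *result = 1;
      return 1;
    }

  if (equals_ignore_case (text, "false"))
    {
      *result = 0;
      return 1;
    }

  return 0;
}
```

With string-only scalars, every file plug-in and script needs this kind of validation on its own, which is one argument for offering typed accessors alongside the string ones.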
The XMP and EXIF thumbnails will be generated as a side-effect of calling gimp_metadata_encode_xmp() or gimp_metadata_encode_exif(). These functions will also update all the other fields that must be automatically updated according to the XMP and EXIF specifications: image dimensions, resolution, modification date, metadata modification date. The XMP specifications also require the MIME type ("dc:format") to be updated automatically, but this can only be done by the file plug-in when saving the file. One option is that all file save plug-ins would call gimp_metadata_set_scalar(image, "dc", "format"...) before requesting the full XMP or EXIF data. Another option would be to add a new function in the API for updating the thumbnail and other metadata explicitly instead of doing it implicitly inside gimp_metadata_encode_xmp/exif(). This function could look like this:
gboolean gimp_metadata_update (gint32 image_ID, gboolean update_thumbnail, const gchar *mime_type);
Once the code is clearly split between the core (library for managing the metadata parasite, XMP and EXIF) and the GUI (metadata editor), I think that it will be easier for me to finish the EXIF parts. I would then welcome some help for the GUI, which is currently a disaster.
-Raphaël
proposal for libgimpmetadata API
Hi,
On Wed, 2006-06-21 at 21:15 +0200, Raphaël Quinet wrote:
Most of the functions listed above are currently implemented in the metadata plug-in and exported in the PDB. So you can find a slightly longer description of these functions by looking in the Procedure Browser and searching for "metadata". If you are curious, you can also look in the code: plug-ins/metadata/metadata.c is where these functions are registered and plug-ins/metadata/xmp-model.c is where they are implemented. After moving these functions into the new libgimpmetadata library, they would still be exported to the PDB but probably renamed gimp-metadata-* instead of plug-in-metadata-*.
I don't really understand that part yet. If the code is in a library that plug-ins can link to, why are the functions exported to the PDB and how exactly does this happen?
Sven
proposal for libgimpmetadata API
On Wed, 21 Jun 2006 22:54:32 +0200, Sven Neumann wrote:
On Wed, 2006-06-21 at 21:15 +0200, Raphaël Quinet wrote:
[...] After moving these functions into the new libgimpmetadata library, they would still be exported to the PDB but probably renamed gimp-metadata-* instead of plug-in-metadata-*.
I don't really understand that part yet. If the code is in a library that plug-ins can link to, why are the functions exported to the PDB and how exactly does this happen?
The file plug-ins would not use these functions via the PDB because they
could use the library directly. However, it is still useful to have
these functions exported to the PDB so that they can be used by scripts.
For example, if you want to have a script that automatically attaches a
Creative Commons license to a file, it would only have to call something
like this:
gimp-metadata-set-scalar (image, "xmpRights", "marked", "true")
gimp-metadata-set-scalar (image, "cc", "license", "http://creativecommons.org/licenses/whatever")
That covers the "why". Regarding the "how", it depends... My long-term goal is to move the metadata viewer/editor into the core (like other info dialogs and so on) because it would be the only way to ensure that it reflects the current state of the metadata while the image is being edited. This would solve some of the annoying concurrency problems: if you open the metadata editor (currently a plug-in), and save the image while the editor is open, then some changes to the metadata are lost because the parasite is modified by the file plug-in but the editor does not know it. I think that I mentioned this before or during the last GIMPCon.
So if/when the metadata viewer/editor is in the core, then it makes sense for the core to export some of these functions to the PDB. But I am not planning on migrating the editor to the core right now. In the meantime, it may be better for the editor (the GUI part) to remain as a plug-in and then it would make sense for it to export these functions to the PDB. But even if these functions are temporarily exported by a plug-in, I think that it would be better to name them "gimp-metadata-*" in order to reflect the intent to move them into the core later, and to avoid breaking scripts that could start using them in the meantime.
-Raphaël
proposal for libgimpmetadata API
Hi,
On Wed, 2006-06-21 at 23:53 +0200, Raphaël Quinet wrote:
The file plug-ins would not use these functions via the PDB because they could use the library directly.
The file plug-ins could as well use the functions via the PDB then. What's the benefit of linking to them?
That covers the "why". Regarding the "how", it depends... My long-term goal is to move the metadata viewer/editor into the core (like other info dialogs and so on) because it would be the only way to ensure that it reflects the current state of the metadata while the image is being edited.
I am somewhat reluctant to see such code in the core, or the core linking to it. Simply because experience shows that parsers aren't perfect and can crash. I wouldn't want to see the core crash because some camera manufacturer made a mistake and the camera creates images with corrupt metadata.
Sven
proposal for libgimpmetadata API
On Thu, 22 Jun 2006 00:13:13 +0200, Sven Neumann wrote:
On Wed, 2006-06-21 at 23:53 +0200, Raphaël Quinet wrote:
The file plug-ins would not use these functions via the PDB because they
could use the library directly.
The file plug-ins could as well use the functions via the PDB then. What's the benefit of linking to them?
Cleaner code (core/GUI separation, maintainable by different people), lower overhead (especially when changing many properties) and more importantly providing the start of a solution for avoiding the concurrency issues that I mentioned earlier.
I am somewhat reluctant to see such code in the core, or the core linking to it. Simply because experience shows that parsers aren't perfect and can crash. I wouldn't want to see the core crash because some camera manufacturer made a mistake and the camera creates images with corrupt metadata.
I understand your concerns. However, I do not see another way to view and modify the metadata in real time. Viewing the metadata should be like viewing any other image properties (info dialog) and doing this in a plug-in that does not know when the image or its metadata is modified means that the metadata displayed by the plug-in may not match the current data: wrong image dimensions, color space, etc. And worse, the plug-in may override some changes to the metadata if it updates it after ignoring other changes that happened outside of its control.
With the current plug-in, you may get annoying results such as an image saved with the wrong thumbnail or with other incorrect metadata, just because you forgot to close the metadata editor before saving and to re-open it just after saving. This should not happen in the core.
Besides, I try to have a parser that is as robust as possible. ;-)
-Raphaël
proposal for libgimpmetadata API
Hi,
On Thu, 2006-06-22 at 00:35 +0200, Raphaël Quinet wrote:
The file plug-ins could as well use the functions via the PDB then. What's the benefit of linking to them?
Cleaner code (core/GUI separation, maintainable by different people), lower overhead (especially when changing many properties) and more importantly providing the start of a solution for avoiding the concurrency issues that I mentioned earlier.
Sorry, but I don't follow you on the first arguments, they seem unrelated. Where the code lives has nothing to do with how clean it is. So the argument for having this library now is that it allows avoiding the concurrency issues. But that's something that we will only get later anyway. That makes me think that it will be best to keep the code in a plug-in, accessible over the PDB. That should give us everything we need for 2.4 and avoids the need for defining an API now that we might have to support for quite a while.
I am somewhat reluctant to see such code in the core, or the core linking to it. Simply because experience shows that parsers aren't perfect and can crash. I wouldn't want to see the core crash because some camera manufacturer made a mistake and the camera creates images with corrupt metadata.
I understand your concerns. However, I do not see another way to view and modify the metadata in real time. Viewing the metadata should be like viewing any other image properties (info dialog) and doing this in a plug-in that does not know when the image or its metadata is modified means that the metadata displayed by the plug-in may not match the current data: wrong image dimensions, color space, etc. And worse, the plug-in may override some changes to the metadata if it updates it after ignoring other changes that happened outside of its control.
I understand your concerns. But since you said that we aren't going to have this code in the core for 2.4, what's the point of preparing that move now? If we have a little more time, we can find other ways to avoid the concurrency problem. The core could for example signal image changes to plug-ins that ask for such notifications. That would be useful for a lot of plug-ins and we could add such functionality right after 2.4.
Sven
proposal for libgimpmetadata API
On Thu, 22 Jun 2006 08:10:19 +0200, Sven Neumann wrote:
On Thu, 2006-06-22 at 00:35 +0200, Raphaël Quinet wrote:
Cleaner code (core/GUI separation, maintainable by different people), lower overhead (especially when changing many properties) and more importantly providing the start of a solution for avoiding the concurrency issues that I mentioned earlier.
Sorry, but I don't follow you on the first arguments, they seem unrelated. Where the code lives has nothing to do with how clean it is.
Well, let's say that it makes it a bit easier if the code is at least split in separate directories (even if a library is not required for that). As for the lower overhead, it does make a difference if the file plug-ins can link with a library directly or if they have to convert all data and pass it through the PDB for every action, especially when these PDB calls have to spawn a new process which in turn decodes the data, re-encodes it again for the PDB (setting the parasite) and then only returns the result to the initial plug-in. These 4 encoding/decoding steps and spawning of new processes have to be done for every single property that is set or read by the file plug-ins.
So the argument for having this library now is that it allows avoiding the concurrency issues. But that's something that we will only get later anyway. That makes me think that it will be best to keep the code in a plug-in, accessible over the PDB. That should give us everything we need for 2.4 and avoids the need for defining an API now that we might have to support for quite a while.
To be frank, I thought that you agreed with the principle of moving the metadata core into a library when we discussed it last time. I thought that the discussion about the API would be mainly about the features that the API should expose, rather than whether it should exist as a library API or be limited to the PDB. I got a bit bored with the PDB implementation and its limitations, that's why I expected to be more motivated to work on the remaining issues after getting rid of the extra PDB calls.
[...] I wouldn't want to see the core crash because some camera manufacturer made a mistake and the camera creates images with corrupt metadata.
By the way, crashing the core because of corrupt metadata is not that likely: the core would only handle the metadata that is already in the parasite. Except for corrupt XCF files (unlikely to come from a camera!), the only way for the metadata to be put in the parasite is via the editor (the user enters some data) or via the file plug-ins, which would be based on the same libgimpmetadata library. So even if there is a bug that could crash the parser despite the safeguards in the code, that bug would affect the file plug-ins before they have a chance to store the metadata in the parasite and hand it over to the core. Of course there is always Murphy's law...
I understand your concerns. But since you said that we aren't going to have this code in the core for 2.4, what's the point of preparing that move now? If we have a little more time, we can find other ways to avoid the concurrency problem. The core could for example signal image changes to plug-ins that ask for such notifications. That would be useful for a lot of plug-ins and we could add such functionality right after 2.4.
I agree that such a notification feature would be useful for a lot of plug-ins. However, my goal is really to have the metadata in the core, handled like any other information about the image: resolution, colorspace, image comment, etc. With the metadata handling in the core, I would like to have some parts visible in the preferences: the "Default New Image" tab has an expander with "Advanced Options". Instead of the generic "Comment" field, there would be separate options to specify "Author" and "Copyright" and I could also add an option for selecting a default license. Another point of integration in the core would be a dockable dialog similar to the "Pointer Information" dialog (or maybe to the currently non-dockable Image->Image Properties dialog) that displays some of the most important parts of the metadata for the active image. This could be configurable: some users may be interested in the date and location, while others are more interested in the dc:description or dc:creator. The core could also notify the user if there is an explicit copyright on an image (some other programs warn you if you attempt to modify or save an image that has a copyright, but I would rather go for a non-intrusive notification).
You can probably argue that some of these features could also be implemented outside the core, but then they would not be very well integrated. For example, it would be difficult to select a default license for all new images if this is not done in the main preferences.
-Raphaël
proposal for libgimpmetadata API
Hi,
On Thu, 2006-06-22 at 13:33 +0200, Raphaël Quinet wrote:
Well, let's say that it makes it a bit easier if the code is at least split in separate directories (even if a library is not required for that). As for the lower overhead, it does make a difference if the file plug-ins can link with a library directly or if they have to convert all data and pass it through the PDB for every action, especially when these PDB calls have to spawn a new process which in turn decodes the data, re-encodes it again for the PDB (setting the parasite) and then only returns the result to the initial plug-in. These 4 encoding/decoding steps and spawning of new processes have to be done for every single property that is set or read by the file plug-ins.
As far as I understood your proposed API, the calls go through the PDB anyway. The API uses image IDs, so the parasite has to be retrieved from the core through the PDB, no?
To be frank, I thought that you agreed with the principle of moving the metadata core into a library when we discussed it last time. I thought that the discussion about the API would be mainly about the features that the API should expose, rather than whether it should exist as a library API or be limited to the PDB. I got a bit bored with the PDB implementation and its limitations, that's why I expected to be more motivated to work on the remaining issues after getting rid of the extra PDB calls.
Last time we discussed this, I didn't expect you to wait with the move until we are about to freeze the APIs and start doing pre-releases. I have the feeling that you are pushing something into 2.4 that doesn't absolutely have to be in. But I may still be persuaded that we need it for 2.4.
I agree that such a notification feature would be useful for a lot of plug-ins. However, my goal is really to have the metadata in the core, handled like any other information about the image: resolution, colorspace, image comment, etc.
It would be very nice if we wouldn't handle all this in the core. Image comment could very well be handled by a plug-in. And color management is being deliberately kept out of the core.
You can probably argue that some of these features could also be implemented outside the core, but then they would not be very well integrated. For example, it would be difficult to select a default license for all new images if this is not done in the main preferences.
Ideally, plug-ins could register their configuration in the gimprc (to some extent they already can do this) and make it accessible in the Preferences dialog. We have already started to work towards this when we introduced libgimpconfig.
The whole point here is however, do we need this library now or can it wait until after 2.4 has been released? That's the only question that really bothers me right now.
Sven
proposal for libgimpmetadata API
Raphaël Quinet wrote:
/* decode the given XMP packet (read from a file) and merge it into the metadata parasite. */
gboolean
gimp_metadata_decode_xmp (gint32 image_ID,
                          const gchar *xmp_packet);

/* generate an XMP packet from the metadata parasite */
const gchar *
gimp_metadata_encode_xmp (gint32 image_ID);

/* decode the given EXIF block (read from a file) and merge it into the metadata parasite. */
gboolean
gimp_metadata_decode_exif (gint32 image_ID,
                           guint exif_size,
                           const gchar *exif_block);

/* generate an EXIF block from the EXIF-compatible parts of the metadata parasite */
gboolean
gimp_metadata_encode_exif (gint32 image_ID,
                           guint *exif_size,
                           const gchar **exif_block);
The prototype for gimp_metadata_encode_xmp() seems inconsistent with the pattern of the other functions you listed. I would have expected it to be:
gboolean gimp_metadata_encode_xmp (gint32 image_ID, guint *xmp_size, const gchar **xmp_block);
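The calling convention suggested here (status as the return value, data and size through out-parameters) can be sketched in plain C as follows. encode_block() is a hypothetical stand-in that merely copies its input, not the proposed GIMP function, and it uses a non-const char ** so that the caller can free the buffer; the proposal itself does not say who would own the returned memory.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoder following the out-parameter pattern: returns 1
 * on success and 0 on failure; on success *block points to a newly
 * allocated, NUL-terminated copy of the payload and *size holds its
 * length in bytes (without the terminator).  The caller frees *block. */
static int
encode_block (const char *payload, unsigned int *size, char **block)
{
  size_t  length;
  char   *copy;

  if (payload == NULL || size == NULL || block == NULL)
    return 0;

  length = strlen (payload);
  copy = malloc (length + 1);
  if (copy == NULL)
    return 0;

  memcpy (copy, payload, length + 1);

  *size  = (unsigned int) length;
  *block = copy;
  return 1;
}
```

Compared with returning the string directly, this form lets a caller tell failure apart from an empty result and reports the length explicitly, which also matters for binary data such as an EXIF block that may contain NUL bytes.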
Example of use:
- An image containing both XMP and EXIF information is loaded.
- Call gimp_metadata_decode_exif (image, exif_size, exif_block) to load the EXIF block into the gimp metadata parasite.
- Call gimp_metadata_decode_xmp (image, xmp_packet) to merge the XMP information into the gimp metadata parasite. If some properties are present in both XMP and EXIF (this is very likely), the old EXIF information is overwritten: XMP always takes precedence.
Are you proposing in step 2 of your "Example of use" above that it be done automatically in the file load plug-ins? I think that would make sense. Any XMP/EXIF data for an image should be converted to parasites on file load and converted back to XMP/EXIF data on file save.