Bug report for ID64917

You are here:

This discussion is connected to the gimp-developer-list.gnome.org mailing list which is provided by the GIMP developers and not related to gimpusers.com.

This is a read-only list on gimpusers.com so this discussion thread is read-only, too.

6 of 6 messages available

Toggle history

Please log in to manage your subscriptions.
Bug report for ID64917	Sashi Kumar	04 Apr 11:54
Bug report for ID64917	Mukund Sivaraman	11 Apr 08:34
Bug report for ID64917	Mukund Sivaraman	11 Apr 08:50
Bug report for ID64917	Sashi Kumar	11 Apr 10:54
Bug report for ID64917	Mukund Sivaraman	11 Apr 11:12
Bug report for ID64917	Sashi Kumar	11 Apr 11:52

Bug report for ID64917

This is the bug report for https://bugzilla.gnome.org/show_bug.cgi?id=649172

The bug is that g_markup_unescape_text() does not unescape the HTML characters. This is the problem while opening the .map file. Also this is not dependent on the format(map type) of the map file.

My patch in comment #3 is wrong. As I mentioned in comment #5 the patch in comment #2 is not precise since it did not satisfy many testcases. So I will explain the problems in that.

When I tried with input: "sample entity : &sash; ' " The expected output should be: "sample entity : &sash; ' " But the result was: "sample ent ty : &sash; ' "

The exact problem is with "strcpy(p, p + strlen(tab[i].enc)-1);" To prove that, I made couple of simple codes using 'strcpy' and user-defined 'strcopy' The output of these two shows the problem.

/* test1.c */ #include
#include
#include

int main ()
{
gchar input[] = "sample entity";
gchar *p = input;
printf ("\n%s\n", input);
strcpy (p, p + 1);
printf ("\n%s\n", input);
return 0;
}

[sashi@SashiPC ~]$ gcc -Wall -g -o test1 test1.c $(pkg-config --cflags --libs glib-2.0)
[sashi@SashiPC ~]$ ./test1

sample entity

ample eentity

/* test2.c */ #include
#include
#include

gchar* strcopy (gchar *dest, gchar *src);

int main () {
gchar input[] = "sample entity";
gchar *p = input;
printf ("\n%s\n", input);
strcopy (p, p + 1);
printf ("\n%s\n", input);
return 0;
}

gchar *
strcopy (gchar *dest,
gchar *src)
{
gchar *d = dest;
gchar *s = src;

do
*d++ = *s;
while (*s++ != '\0');

return d - 1;

} [sashi@SashiPC ~]$ gcc -Wall -g -o test2 test2.c $(pkg-config --cflags --libs glib-2.0)
[sashi@SashiPC ~]$ ./test2

sample entity

ample entity

So the Expected Patch is:

--- imap_csim.y.old 2013-04-04 00:17:15.564849250 +0530 +++ imap_csim.y 2013-04-04 15:40:43.693297813 +0530 @@ -38,6 +38,9 @@
extern int csim_lex(void);
extern int csim_restart(FILE *csim_in); static void csim_error(char* s);
+static gchar* unescape_text(gchar *); +static gchar* strcopy (gchar *dest, gchar *src); +

static enum {UNDEFINED, RECTANGLE, CIRCLE, POLYGON} current_type; static Object_t *current_object;
@@ -260,7 +263,7 @@
if (current_type == UNDEFINED) { g_strreplace(&_map_info->default_url, $3); } else {
- object_set_url(current_object, $3); + object_set_url(current_object, unescape_text($3)); }
g_free ($3);
}
@@ -280,42 +283,42 @@

alt_tag : ALT '=' STRING {
- object_set_comment(current_object, $3); + object_set_comment(current_object, unescape_text($3)); g_free ($3);
}
;

target_tag : TARGET '=' STRING {
- object_set_target(current_object, $3); + object_set_target(current_object, unescape_text($3)); g_free ($3);
}
;

onmouseover_tag : ONMOUSEOVER '=' STRING {
- object_set_mouse_over(current_object, $3); + object_set_mouse_over(current_object, unescape_text($3)); g_free ($3);
}
;

onmouseout_tag : ONMOUSEOUT '=' STRING {
- object_set_mouse_out(current_object, $3); + object_set_mouse_out(current_object, unescape_text($3)); g_free ($3);
}
;

onfocus_tag : ONFOCUS '=' STRING {
- object_set_focus(current_object, $3); + object_set_focus(current_object, unescape_text($3)); g_free ($3);
}
;

onblur_tag : ONBLUR '=' STRING {
- object_set_blur(current_object, $3); + object_set_blur(current_object, unescape_text($3)); g_free ($3);
}
;
@@ -347,3 +350,53 @@
}
return status;
}
+
+static gchar*
+unescape_text (gchar *input)
+{
+ /*
+ * We "unescape" simple things "in place", knowing that unescaped strings always are
+ * shorter than the original input. + *
+ * It is a shame there is no g_markup_unescape_text() function, but instead you have
+ * to create a full GMarkupParser/Context. + */
+ struct token {
+ const char *enc, unenc;
+ };
+ const struct token tab[] = {
+ { """, '"' },
+ { "'", '\'' },
+ { "&", '&' },
+ { "<", '' }
+ };
+ size_t i;
+
+ for (i = 0; i < sizeof(tab)/sizeof(tab[0]); i++) { + char *p;
+ for (p = strstr(input, tab[i].enc); p != NULL; p = strstr(p, tab[i].enc)) { + *p++ = tab[i].unenc;
+ strcopy(p, p + strlen(tab[i].enc)-1); + if (*p == 0)
+ break;
+ }
+ }
+
+
+ return input;
+}
+
+static gchar*
+strcopy (gchar *dest,
+ gchar *src)
+{
+ gchar *d = dest;
+ gchar *s = src;
+
+ do
+ *d++ = *s;
+ while (*s++ != '\0');
+
+ return d - 1;
+}

This patch works good :)

Bug report for ID64917

Hi Sashi

On Thu, Apr 04, 2013 at 05:24:34PM +0530, Sashi Kumar wrote:

This is the bug report for https://bugzilla.gnome.org/show_bug.cgi?id=649172

I'm going to commit the patch in comment #7 (modified from the patch in comment #2) and push it. Thank you for reviewing the patch.

Two things you can do:

* An unrelated crash happens when alt="" is present in a map file, during load. Can you check what causes it and propose a fix?

* Please propose a fix for UTF-8 un-escaping.

Mukund

Bug report for ID64917

On Thu, Apr 11, 2013 at 02:04:19PM +0530, Mukund Sivaraman wrote:

Two things you can do:
* An unrelated crash happens when alt="" is present in a map file, during load. Can you check what causes it and propose a fix?
* Please propose a fix for UTF-8 un-escaping.

One more thing:

* Do the other parsers (NCSA and CERN) also need unescaping?

Please can you check and make patches for these? You can open new bugs for these to track them.

Mukund

Bug report for ID64917

Two things you can do:
* An unrelated crash happens when alt="" is present in a map file, during load. Can you check what causes it and propose a fix?

Yeah, I already started working on it. It is due to double free. This crash can be avoided if MALLOC_CHECK_ environment variable is set to 0. But anyhow I will try to produce a fix in the code for this.

* Please propose a fix for UTF-8 un-escaping.

I will start that soon.

One more thing:
* Do the other parsers (NCSA and CERN) also need unescaping?

Please can you check and make patches for these? You can open new bugs for these to track them.

As far as I checked till now, they don't need unescaping because they don't write the HTML encodings. CSIM only needs unescaping.

Bug report for ID64917

Hi Sashi

On Thu, Apr 11, 2013 at 04:24:29PM +0530, Sashi Kumar wrote:

Two things you can do:
* An unrelated crash happens when alt="" is present in a map file, during load. Can you check what causes it and propose a fix?
Yeah, I already started working on it. It is due to double free. This crash can be avoided if MALLOC_CHECK_ environment variable is set to 0. But anyhow I will try to produce a fix in the code for this.

As you probably understand too, updating MALLOC_CHECK_ is neither a fix nor a workaround. :)

The checking exists to catch such errors, and updating this environment variable will only turn off the checking, but it will not stop the buggy code from being executed.

* Please propose a fix for UTF-8 un-escaping.
I will start that soon.

Nod.

One more thing:
* Do the other parsers (NCSA and CERN) also need unescaping?
As far as I checked till now, they don't need unescaping because they don't write the HTML encodings. CSIM only needs unescaping.

Super. Thank you for checking. :)

Mukund

Bug report for ID64917

Two things you can do:
* An unrelated crash happens when alt="" is present in a map file, during load. Can you check what causes it and propose a fix?
Yeah, I already started working on it. It is due to double free. This
crash
can be avoided if MALLOC_CHECK_ environment variable is set to 0. But anyhow I will try to produce a fix in the code for this.
As you probably understand too, updating MALLOC_CHECK_ is neither a fix nor a workaround. :)
The checking exists to catch such errors, and updating this environment variable will only turn off the checking, but it will not stop the buggy code from being executed.

That's right :) I will try to fix :)

* Please propose a fix for UTF-8 un-escaping.
I will start that soon.
Nod.
One more thing:
* Do the other parsers (NCSA and CERN) also need unescaping?
As far as I checked till now, they don't need unescaping because they
don't
write the HTML encodings. CSIM only needs unescaping.
Super. Thank you for checking. :)

yw :)

Latest comments

Daniel
over 3 years ago

After many years your tutorial is still very helpful! Thanks a lot! Smoke and Flames
Chul, Kim
over 3 years ago

t's my first time seeing this program. GIMP 2.10.18 (Windows)
PatrickEdwards
over 3 years ago

good to see Create a new species by merging two or more animals!
SPIKETHMAN
over 3 years ago

I WANT TO CREATE A CREATURE Create a new species by merging two or more animals!

Latest news

Featured content

Bug report for ID64917

Bug report for ID64917

Bug report for ID64917

Bug report for ID64917

Bug report for ID64917

Bug report for ID64917

Bug report for ID64917

Current versions

Download latest stable
GIMP: 2.10.18

Advertisement

10 latest tutorials

Latest comments

10 random tutorials

Latest discussions

Latest news

Featured content

Bug report for ID64917

Bug report for ID64917

Bug report for ID64917

Bug report for ID64917

Bug report for ID64917

Bug report for ID64917

Bug report for ID64917

Current versions

Download latest stable GIMP: 2.10.18

Advertisement

10 latest tutorials

Latest comments

10 random tutorials

Latest discussions

Download latest stable
GIMP: 2.10.18