[PATCH 1/4] Tile caching performance patches
This discussion is connected to the gimp-developer-list.gnome.org mailing list which is provided by the GIMP developers and not related to gimpusers.com.
This is a read-only list on gimpusers.com so this discussion thread is read-only, too.
[PATCH 0/4] Tile caching performance patches
While working on my resampler, I noticed the tile cache often got itself into near-deadlock situations when it was actually working under moderate cache pressure. I have a series of four tile cache performance patches following this mail; one is a simple one-liner that removes a few integer divisions, the second adds a decent amount of profiling collection and output to the tile cache, the third fixes the tile cache strategy and the fourth addresses bugs and tuning problems with the idle swapper. Along with these patches, I've also written automated test/profiling scripts.
Detailed description of the patches, test scripts and some profilng
results can be found at
http://web.mit.edu/xiphmont/Public/gimp-fu/gimp-cache.html
The patches are against the gnome master, but I believe they can be [and should be] applied to 2.6 as well.
Cheers, Monty
[PATCH 1/4] Tile caching performance patches
Patch attached [to avoid any chance of gmail mangling lines]
Detailed patch description at: http://web.mit.edu/xiphmont/Public/gimp-fu/gimp-cache.html
Monty
[PATCH 2/4] Tile caching performance patches
Patch attached [to avoid any chance of gmail mangling lines]
Detailed patch description at: http://web.mit.edu/xiphmont/Public/gimp-fu/gimp-cache.html
Monty
[PATCH 3/4] Tile caching performance patches
Patch attached [to avoid any chance of gmail mangling lines]
Detailed patch description at: http://web.mit.edu/xiphmont/Public/gimp-fu/gimp-cache.html
Monty
[PATCH 4/4] Tile caching performance patches
Patch attached [to avoid any chance of gmail mangling lines]
Detailed patch description at: http://web.mit.edu/xiphmont/Public/gimp-fu/gimp-cache.html
Monty
[PATCH 1/4] Tile caching performance patches
Hi,
first of all thanks a lot for providing these patches. I definitely want to get them merged as soon as possible. But there are a few minor issues that should be discussed first. So let me start by commenting on your first patch:
On Tue, 2009-06-02 at 04:11 -0400, Christopher Montgomery wrote:
#define TILE_DATA_POINTER(tile,x,y) \ ((tile)->data + \
- (((y) % TILE_HEIGHT) * (tile)->ewidth + ((x) % TILE_WIDTH)) * (tile)->bpp)
-
+ (((y) & (TILE_HEIGHT-1)) * (tile)->ewidth + ((x) & (TILE_WIDTH-1))) * (tile)->bpp)
As far as I know pretty much any compiler out there should be able to replace a modulo by a power-of-2 constant by the bit-wise AND operation without us explicitly doing so (see also http://en.wikipedia.org/wiki/Modulo_operation#Performance_issues). So for the benefit of readable code I suggest that we keep the code as it is.
Sven
[PATCH 2/4] Tile caching performance patches
Hi,
On Tue, 2009-06-02 at 04:12 -0400, Christopher Montgomery wrote:
+#ifdef TILE_PROFILING
+#include
If we use GTimeVal instead of struct timeval, we can avoid this include (and a possible portability problem).
+#ifdef TILE_PROFILING
+ if ((cur_cache_size + tile->size) > max_cache_size){ + struct timeval now;
+ struct timeval later;
+ gettimeofday(&now,NULL);
Please use GTimeVal and g_get_current_time() here.
+ if(tile_total_interactive_usec < 0){ + tile_total_interactive_usec += 1000000; + tile_total_interactive_sec--; + }
These lines don't adhere to the coding style guidelines. Please add a space after the if and move the opening curly bracket to the next line. There are other places in your patches that need similar fixes.
Other than these nit-picking style issues, the patch looks good to me. Nice work!
Sven
[PATCH 1/4] Tile caching performance patches
On Tue, Jun 2, 2009 at 4:46 PM, Sven Neumann wrote:
Hi,
first of all thanks a lot for providing these patches. I definitely want to get them merged as soon as possible. But there are a few minor issues that should be discussed first. So let me start by commenting on your first patch:
On Tue, 2009-06-02 at 04:11 -0400, Christopher Montgomery wrote:
#define TILE_DATA_POINTER(tile,x,y) \ ((tile)->data + \
- (((y) % TILE_HEIGHT) * (tile)->ewidth + ((x) % TILE_WIDTH)) * (tile)->bpp)
-
+ (((y) & (TILE_HEIGHT-1)) * (tile)->ewidth + ((x) & (TILE_WIDTH-1))) * (tile)->bpp)As far as I know pretty much any compiler out there should be able to replace a modulo by a power-of-2 constant by the bit-wise AND operation without us explicitly doing so (see also http://en.wikipedia.org/wiki/Modulo_operation#Performance_issues). So for the benefit of readable code I suggest that we keep the code as it is.
Interesting. I got a noticable and repeatable performance benefit. Which is not to say I haven't somehow mismeasured it. I agree the modulo is more readable.
...perhaps the difference is the difference of (x) or (y) possibly being negative and additional conformance-related assembly getting generated? I suppose there's no reason to speculate, I'll go read the assembly gcc generates and that will answer everything, at least for me.
Monty
[PATCH 2/4] Tile caching performance patches
+#ifdef TILE_PROFILING
+#includeIf we use GTimeVal instead of struct timeval, we can avoid this include (and a possible portability problem).
Agreed.
These lines don't adhere to the coding style guidelines. Please add a space after the if and move the opening curly bracket to the next line. There are other places in your patches that need similar fixes.
I'll make that change too and resubmit.
Monty
[PATCH 3/4] Tile caching performance patches
Hi,
a few more coding style despite the ones I already pointed out:
+ if(!tile)return FALSE;
Please write this as
if (! tile) return FALSE;
+ if(PENDING_WRITE(t))
+ acc+=t->size;
+ t=t->next;
Please insert an empty line like this:
if (PENDING_WRITE (t)) acc += t->size;
t = t->next;
Other than these style issues, the patch looks good. In particular your benchmarking data looks promising. Nice work!
Sven
[PATCH 4/4] Tile caching performance patches
Hi,
On Tue, 2009-06-02 at 04:13 -0400, Christopher Montgomery wrote:
-#define IDLE_SWAPPER_TIMEOUT 250 -
+#define IDLE_SWAPPER_START 1000 +#define IDLE_SWAPPER_INTERVAL 20
+#define IDLE_SWAPPER_TILES_PER 10
Should that constant perhaps better be called IDLE_SWAPPER_TILES_PER_RUN ?
Sven
[PATCH 4/4] Tile caching performance patches
On Tue, Jun 2, 2009 at 5:01 PM, Sven Neumann wrote:
Hi,
On Tue, 2009-06-02 at 04:13 -0400, Christopher Montgomery wrote:
-#define IDLE_SWAPPER_TIMEOUT 250 -
+#define IDLE_SWAPPER_START 1000 +#define IDLE_SWAPPER_INTERVAL 20
+#define IDLE_SWAPPER_TILES_PER 10Should that constant perhaps better be called IDLE_SWAPPER_TILES_PER_RUN ?
Or PER_INTERVAL, sure.
Monty
[PATCH 1/4] Tile caching performance patches
Hi,
On Tue, 2009-06-02 at 16:56 -0400, Christopher Montgomery wrote:
As far as I know pretty much any compiler out there should be able to replace a modulo by a power-of-2 constant by the bit-wise AND operation without us explicitly doing so (see also http://en.wikipedia.org/wiki/Modulo_operation#Performance_issues). So for the benefit of readable code I suggest that we keep the code as it is.
Interesting. I got a noticable and repeatable performance benefit. Which is not to say I haven't somehow mismeasured it. I agree the modulo is more readable.
...perhaps the difference is the difference of (x) or (y) possibly being negative and additional conformance-related assembly getting generated? I suppose there's no reason to speculate, I'll go read the assembly gcc generates and that will answer everything, at least for me.
I might very well be wrong here. If there's indeed a difference in the generated assembly and a noticeable performance benefit, than let's use the optimized macro. But perhaps we can add a short comment there explaining that ((y) & (TILE_HEIGHT-1)) is equivalent to ((y) % TILE_HEIGHT). Not everyone reading this code will be aware of this immediately.
Sven
[PATCH 1/4] Tile caching performance patches
I'd like to mention also that there are also some minor problems with whitespace
"
ion@gbubuntu:~/st/gimp2/gimp$ git-am /tmp/0002*.patch
Applying Add additional profiling to tile usage in order to analyze
efficiency and behavior of the tile cache. Profiling includes run-time
indication of idle swapper activity.
.dotest/patch:193: trailing whitespace.
guint zorched : 1; /* was the tile flushed due to cache pressure
.dotest/patch:255: trailing whitespace.
#endif
.dotest/patch:304: trailing whitespace.
#ifdef TILE_PROFILING
.dotest/patch:318: trailing whitespace.
.dotest/patch:319: trailing whitespace.
#ifdef TILE_PROFILING
warning: squelched 12 whitespace errors
warning: 17 lines add whitespace errors.
ion@gbubuntu:~/st/gimp2/gimp$ git-am /tmp/0003*.patch
Applying Replace two list 'flush clean first' cache strategy with an
LRu strategy. Although the clean-first strategy gives fast light-load
performance, it also degrades catastrophically under moderate cache
pressure. LRU is not as efficient under light load, but degrades more
gracefully under moderate and heavy load.
.dotest/patch:148: trailing whitespace.
.dotest/patch:191: trailing whitespace.
.dotest/patch:196: trailing whitespace.
.dotest/patch:202: trailing whitespace.
.dotest/patch:205: trailing whitespace.
warning: squelched 8 whitespace errors
warning: 13 lines add whitespace errors.
ion@gbubuntu:~/st/gimp2/gimp$ git-am /tmp/0004*.patch
Applying Correct startup flaw in idle swapper start: Don't watch only
UI idling, but also watch that the cache itself is idle. Previously it
would start during transforms and long pyramid rendering ops and toss
writes and large seeks into the tile cache while it was potentially
under heavy pressure.
.dotest/patch:149: trailing whitespace.
.dotest/patch:157: trailing whitespace.
.dotest/patch:158: trailing whitespace. if(count>=IDLE_SWAPPER_TILES_PER) .dotest/patch:186: trailing whitespace.
.dotest/patch:194: trailing whitespace.
warning: squelched 1 whitespace error
warning: 6 lines add whitespace errors.
"
(patch 0001 applies with no problems.)
[PATCH 1/4] Tile caching performance patches
For about a month I'd turned on emacs's trailing whitespace autotrim and it was a cure worse than the disease. How shall I kill my own whitespace without generating patches 4x larger than necessary due to others' trailing whitespace?
Monty
[PATCH 1/4] Tile caching performance patches
On Tue, Jun 2, 2009 at 10:15 PM, Christopher Montgomery wrote:
For about a month I'd turned on emacs's trailing whitespace autotrim and it was a cure worse than the disease. How shall I kill my own whitespace without generating patches 4x larger than necessary due to others' trailing whitespace?
[best I've struck on is manually running M-x delete-trailing-whitespace on each file, followed by git format-patch -b. Surely for something that can be done completely mechanically, there's some better way...]
Monty
[PATCH 1/4] Tile caching performance patches
In fact, with -O2, gcc is generating more complex assembly for % than &, though not an integer division. Assembly generated for version using &:
tile_data_pointer:
.LFB29:
movzwl 8(%rdi), %eax
andl $63, %edx
andl $63, %esi
imull %eax, %edx
movzbl 7(%rdi), %eax
addl %esi, %edx
imull %eax, %edx
movslq %edx,%rax
addq 24(%rdi), %rax
ret
assembly generated for version using %:
tile_data_pointer:
.LFB29:
movl %edx, %eax
sarl $31, %eax
shrl $26, %eax
addl %eax, %edx
andl $63, %edx
subl %eax, %edx
movzwl 8(%rdi), %eax
imull %eax, %edx
movl %esi, %eax
sarl $31, %eax
shrl $26, %eax
addl %eax, %esi
andl $63, %esi
subl %eax, %esi
movzbl 7(%rdi), %eax
addl %esi, %edx
imull %eax, %edx
movslq %edx,%rax
addq 24(%rdi), %rax
ret
On Tue, Jun 2, 2009 at 5:09 PM, Sven Neumann wrote:
Hi,
On Tue, 2009-06-02 at 16:56 -0400, Christopher Montgomery wrote:
As far as I know pretty much any compiler out there should be able to replace a modulo by a power-of-2 constant by the bit-wise AND operation without us explicitly doing so (see also http://en.wikipedia.org/wiki/Modulo_operation#Performance_issues). So for the benefit of readable code I suggest that we keep the code as it is.
Interesting. I got a noticable and repeatable performance benefit. Which is not to say I haven't somehow mismeasured it. I agree the modulo is more readable.
...perhaps the difference is the difference of (x) or (y) possibly being negative and additional conformance-related assembly getting generated? I suppose there's no reason to speculate, I'll go read the assembly gcc generates and that will answer everything, at least for me.
I might very well be wrong here. If there's indeed a difference in the generated assembly and a noticeable performance benefit, than let's use the optimized macro. But perhaps we can add a short comment there explaining that ((y) & (TILE_HEIGHT-1)) is equivalent to ((y) % TILE_HEIGHT). Not everyone reading this code will be aware of this immediately.
Sven
[PATCH 1/4] Tile caching performance patches
Christopher Montgomery wrote:
For about a month I'd turned on emacs's trailing whitespace autotrim and it was a cure worse than the disease. How shall I kill my own whitespace without generating patches 4x larger than necessary due to others' trailing whitespace?
Caring too much about trailing whitespace is IMO a good way to waste time. Just make sure to not commit trailing whitespace, and remove trailing whitespace on lines you change, and that's fine. Having an editor automatically remove trailing whitespace on save just breaks git-blame and pollutes diffs.
/ Martin
[PATCH 1/4] Tile caching performance patches
Hi,
On Wed, 2009-06-03 at 11:41 +0930, David Gowers wrote:
I'd like to mention also that there are also some minor problems with whitespace
Right. I suggest to add the following lines to your .emacs file:
(setq c-mode-common-hook '(lambda () (setq indent-tabs-mode nil) (setq show-trailing-whitespace "true")))
Sven
[PATCH 1/4] Tile caching performance patches
Hi,
On Tue, 2009-06-02 at 22:15 -0400, Christopher Montgomery wrote:
For about a month I'd turned on emacs's trailing whitespace autotrim and it was a cure worse than the disease. How shall I kill my own whitespace without generating patches 4x larger than necessary due to others' trailing whitespace?
There should not be any trailing whitespace in the GIMP source code. We have several times trimmed away all trailing whitespace and committed these cleanups. If new trailing whitespace sneaked in, then we should probably do that again. Now that git warns about trailing whitespace, we have a good chance to finally end this disease.
I will commit another global whitespace cleanup later today.
Sven
[PATCH 1/4] Tile caching performance patches
On Wed, Jun 3, 2009 at 3:12 AM, Sven Neumann wrote:
There should not be any trailing whitespace in the GIMP source code. We have several times trimmed away all trailing whitespace and committed these cleanups. If new trailing whitespace sneaked in, then we should probably do that again.
You're right and I was vaguely astounded to see that.
'Show whitespace' seems like the right solution, not forcing anything.
[BTW, did I manage to avoid any slipups in the new patches?]
Monty