Perf tools evaluation and proposal
Perf tools evaluation and proposal | Henrik Akesson | 17 Jun 11:37
Perf tools evaluation and proposal | Boudewijn Rempt | 17 Jun 11:52
Perf tools evaluation and proposal | Henrik Akesson | 17 Jun 12:44
Perf tools evaluation and proposal | jcupitt@gmail.com | 20 Jun 23:15
Perf tools evaluation and proposal | Henrik Akesson | 22 Jun 11:38
Perf tools evaluation and proposal | jcupitt@gmail.com | 22 Jun 12:00
Perf tools evaluation and proposal
Hi,
here's the promised evaluation of the current profiling tools and a proposal.
Regards,
Henrik
TOOLS
Valgrind - retained
Oprofile - retained
gprof, sprof - obsoleted by oprofile
ltrace, ptrace - not capable of profiling dynamically loaded objects (dlopen-ed)
GEGL instrumentation - the same functionality is achieved by the proposal below.
Further info: http://sites.google.com/site/computerresearcher/profiling-tools/
################################################################################
PART ONE - VALGRIND - Where the time is spent
################################################################################
Current tools are capable of producing an abundance of profiling data.
Callgrind (with Cachegrind activated) will produce 13 different
measurements for every line of code that has been executed during the
program run:
1 on how many times the line has been executed
8 on cache use
4 on branching
Running callgrind on a command line gegl program that runs gaussian-blur produces profiling data for 149 different objects (89 of which are gegl operations that get loaded)!!!
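For illustration, here is a rough sketch of how such a run could be driven from a ruby script (the gegl command line and file names are just placeholders, not the actual test case):

  #!/usr/bin/env ruby
  # Sketch: run a gegl benchmark under callgrind with cache and branch
  # simulation switched on, so that all 13 per-line counters are produced.
  benchmark = "gegl blur-test.xml -o /tmp/out.png"   # placeholder test case

  cmd = ["valgrind",
         "--tool=callgrind",
         "--cache-sim=yes",                  # the 8 data/cache counters
         "--branch-sim=yes",                 # the 4 branch counters
         "--callgrind-out-file=callgrind.out.%p",
         benchmark].join(" ")

  system(cmd) or abort "callgrind run failed"
  # callgrind_annotate callgrind.out.<pid> then lists the counts per line.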
NOTE: A serious limitation of Valgrind is that it can only count events; it cannot tell you how much time things take (such as a cache miss or the execution of an instruction). This is because it heavily modifies the code before running it, which renders any time measurements useless.
I propose to implement a tool that allows the user to (step 1) select
the data he is interested in and (step 2) present the results in an
easy-to-understand way. (A rough sketch of the selection step follows
the three step lists below.)
Step 1 - SELECTION - user workflow:
1a) Select the libraries of interest
1b) Select the entry/exit function (normally the main function), i.e.
only data measured inside this function (including calls to other
functions) is displayed
1c) Hotpath elaboration, i.e. display and selection of the code
execution path of interest. (TBD)
Step 2 - EVALUATION - workflow:
2a) Code annotation. I.e. display the above selected code with measurements
2b) Trend display
Step 3 - MANAGEMENT - workflow:
3a) Adding new data (cmd line and web)
3b) Adding new evaluation scenarios (web)
3c) Listing data
3d) Deleting data
3e) Listing scenarios
3f) Deleting scenarios
3g) Adding scenarios
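As a rough sketch of the selection step (1a), the per-object part could be as simple as summing the counters in the callgrind output file. This is only illustrative: it ignores callgrind's name compression and relative-position shorthand, and step 1b (inclusive cost of an entry function) would additionally need the calls= records to be followed.

  # usage: ruby sum_by_object.rb callgrind.out.<pid>
  events     = []
  current_ob = nil
  skip_next  = false     # the cost line after "calls=" is inclusive call cost
  totals     = {}

  File.foreach(ARGV[0]) do |line|
    case line
    when /^events:\s*(.*)/ then events = $1.split
    when /^ob=(.*)/        then current_ob = $1.strip
    when /^calls=/         then skip_next = true
    when /^\d+(\s+\d+)+\s*$/             # "<line> <cost> <cost> ..." record
      if skip_next
        skip_next = false
      else
        costs = line.split.drop(1).map { |c| c.to_i }
        totals[current_ob] ||= Array.new(events.size, 0)
        costs.each_with_index { |c, i| totals[current_ob][i] += c }
      end
    end
  end

  totals.each do |ob, sums|
    puts ob
    events.zip(sums) { |ev, n| puts "  #{ev}: #{n}" }
  end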
################################################################################
PART TWO - OPROFILE - What the processor is doing
################################################################################
An important limitation of the processor's performance counters is that, out of a choice of about 100 events, only 2-8 (depending on processor make and model) can be used at the same time. There have been some attempts by researchers to multiplex them, but this requires modifying kernel code and is thus not accessible to the ordinary mortal developer.
I propose to:
a) Assemble all the important data into the database by repeatedly
running the test/performance case (sketched below)
b) Possibly use statistics in order to obtain reliable data (i.e.
determine which distribution the samples follow and use this during
data collection). TBD.
c) Add groups of annotation data to point 2a above. Basically groups
such as "L1 Cache use", "L2 Cache use", "Vector extensions" etc...
Normally the data should be stored internally in the same format as for valgrind. This means that only an import tool needs to be written in order to add oprofile data.
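As a very rough sketch of point a), the data collection could look something like the following. The event names are just examples for one CPU family, and the opcontrol calls assume root and a prior --no-vmlinux/--vmlinux setup; this is not a finished design:

  # Cycle through groups of hardware events, rerunning the same benchmark,
  # since only a few counters are available at the same time.
  EVENT_GROUPS = {
    "l1-cache" => ["L1D_REPL:100000", "L1I_MISSES:100000"],
    "l2-cache" => ["L2_LINES_IN:100000"]
  }
  BENCHMARK = "gegl blur-test.xml -o /tmp/out.png"   # placeholder

  EVENT_GROUPS.each do |group, events|
    event_args = events.map { |e| "--event=#{e}" }.join(" ")
    system("opcontrol --reset")
    system("opcontrol --setup #{event_args}")
    system("opcontrol --start")
    system(BENCHMARK)
    system("opcontrol --stop")
    system("opcontrol --dump")
    # opreport's per-symbol output is what would be imported into the database.
    system("opreport --symbols > report-#{group}.txt")
  end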
################################################################################
IMPLEMENTATION DETAILS
################################################################################
I propose a web-based implementation using JRuby and Ruby on Rails together with an SQL database (potentially an object database such as db4o).
For the code annotation, I propose to use the existing subversion repository for retrieving the code to be annotated, and GNU Source-highlight.
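To make this slightly more concrete, here is a rough sketch of the storage side in Rails; the models, columns and the Rails-2-style query below are only illustrative, not a decided schema:

  # Illustrative ActiveRecord models; table and column names are placeholders.
  class Run < ActiveRecord::Base
    # columns: revision (svn revision), scenario, tool ("callgrind"/"oprofile"), run_at
    has_many :measurements
  end

  class Measurement < ActiveRecord::Base
    # columns: run_id, object (shared library), file, line, event, value
    belongs_to :run
  end

  # Trend sketch: total of one counter per svn revision for a given scenario.
  def trend(scenario, event = "Ir")
    Measurement.sum(:value,
      :joins      => :run,
      :group      => "runs.revision",
      :conditions => ["runs.scenario = ? AND measurements.event = ?",
                      scenario, event])
  end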
Perf tools evaluation and proposal
On Wed, 17 Jun 2009, Henrik Akesson wrote:
I propose to implement a tool that allows the user to (step 1) select the data he is interested in and (step 2) present the results in an easy-to-understand way.
Doesn't kcachegrind provide all of this for you?
Boudewijn
Perf tools evaluation and proposal
The main differences between my solution and KCachegrind are:
I would use a database approach that allows performance history to be displayed. I.e. in the case of a drop in performance, a developer would be able to localise it to a certain commit (see the small sketch after this list). Also useful for comparing the performance of two different solutions during development.
The aggregation of counters, IMHO, is very important for understanding what the processor is doing. It is a very difficult task, and just having data from 2-8 counters is like providing only a small window onto a big, complex image.
I think you could summarise KCachegrind as a tool that shows you all the data, whereas the approach I propose is to show as little data as possible in order to localise performance issues and, once one is found, to show the full picture of what the processor is doing.
IMHO KCachegrind is neither easy to use nor easy to understand (e.g. what does "Distance 6-11 (6)" mean? etc...)
My solution is web based.
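To make the "localise it to a certain commit" point concrete, here is a tiny standalone sketch with made-up numbers:

  # Given per-revision totals for one counter (e.g. instruction count),
  # flag the commits where the value jumps by more than 5%.
  def regressions(totals_by_revision, threshold = 1.05)
    points = totals_by_revision.sort_by { |rev, _| rev }
    points.each_cons(2).
      select { |(_, prev), (_, cur)| cur > prev * threshold }.
      map    { |_, (rev, _)| rev }
  end

  # regressions(3010 => 1_200_000, 3011 => 1_210_000, 3012 => 1_900_000)
  # => [3012]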
I believe that these differences are enough to justify a new tool. Are they to you?
Henrik
2009/6/17 Boudewijn Rempt :
On Wed, 17 Jun 2009, Henrik Akesson wrote:
I propose to implement a tool that allows the user to (step 1) select the data he is interested in and (step 2) present the results in an easy-to-understand way.
Doesn't kcachegrind provide all of this for you?
Boudewijn
Perf tools evaluation and proposal
Hi Henrik,
2009/6/17 Henrik Akesson :
I believe that these differences are enough to justify a new tool. Are they to you?
OK, though I hope the project is not becoming too ambitious. One of the reasons we all liked your original proposal so much was that it seemed focused and clearly achievable. I would be slightly concerned that you are maybe looking too much at the presentation of results and not enough at getting the numbers to start with.
How about as a next step setting up a small test suite, a series of automated measurements, and a simple reporting generator? Leave the fancy web interface until the basic tests and measurements are working. This would also give you something concrete to show at the mid-term evaluation.
John
Perf tools evaluation and proposal
John,
Hmm, I'm not sure I've expressed myself clearly...
Currently, I have a simple ruby test harness and a gaussian-blur test that I use to generate test data with Valgrind. For the valgrind part, I plan on using all the measurements callgrind + cachegrind can provide. Oprofile is much more complicated (as there's a choice of roughly 80 measurements and because it's based on statistical sampling).
This is why I focus on valgrind for mid-term. Paying attention to your concern, I propose to start with a very simple web interface (using text when possible instead of fancy graphics etc.) and then work towards improving it.
However, I will not have the time to set up a test suite before mid-term. Therefore I propose to start that ASAP after the mid-term.
I intend to produce a fully functional suite based on Valgrind before starting with oprofile, as I consider the latter the "high-risk" part of the project. That way I can guarantee you a working tool-set using valgrind, and hopefully also something that can provide the "bells and whistles" of oprofile.
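For reference, a minimal sketch of the kind of harness I mean (heavily simplified; the benchmark command, file names and directory layout are placeholders):

  # Run one benchmark under callgrind and file the result under the current
  # svn revision, so runs can later be compared across commits.
  require 'fileutils'

  revision  = `svnversion .`.strip
  benchmark = "gegl gaussian-blur-test.xml -o /tmp/out.png"   # placeholder
  outfile   = "callgrind.out.tmp"

  system("valgrind --tool=callgrind --cache-sim=yes --branch-sim=yes " \
         "--callgrind-out-file=#{outfile} #{benchmark}") or abort "run failed"

  FileUtils.mkdir_p("results/#{revision}")
  FileUtils.mv(outfile, "results/#{revision}/gaussian-blur.callgrind")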
Does this seem reasonable to you?
Henrik
2009/6/20 :
Hi Henrik,
2009/6/17 Henrik Akesson :
I believe that these differences are enough to justify a new tool. Are they to you?
OK, though I hope the project is not becoming too ambitious. One of the reasons we all liked your original proposal so much was that it seemed focused and clearly achievable. I would be slightly concerned that you are maybe looking too much at the presentation of results and not enough at getting the numbers to start with.
How about as a next step setting up a small test suite, a series of automated measurements, and a simple reporting generator? Leave the fancy web interface until the basic tests and measurements are working. This would also give you something concrete to show at the mid-term evaluation.
John
Perf tools evaluation and proposal
Hi again Henrik,
2009/6/22 Henrik Akesson :
Currently, I have a simple ruby test harness and a gaussian-blur test that I use to generate test data with Valgrind.
Ah, OK, sorry, I'd only seen the report, I didn't realise you had something concrete going already. I was worried there might not be much to see at mid-term.
Your proposal sounds good to me, well done.
However, I will not have the time to set up a test suite before mid-term. Therefore I propose to start that ASAP after the mid-term.
On the performance tests, a scenario I've used in the past is described here:
http://www.vips.ecs.soton.ac.uk/index.php?title=Benchmarks
It's derived from a real application which took high-resolution master images off a server and generated files for printing on a large-format inkjet. It might be a bit fiddly to implement. A much simpler version is here:
http://www.vips.ecs.soton.ac.uk/index.php?title=Speed_and_Memory_Use
This is just: load, crop, shrink, sharpen, save. It's not very demanding, but it is very easy to implement, and typical of applications like Picasa or F-Spot. It might be helpful I guess.
I'm sure the gimp list could suggest an application benchmark that would be typical of GEGL use in Gimp.
John