matrix transposition in java

No problem. Curiosity got the better of me, and I looked up the source code of these new methods to find out how it would have been done before Java 6:

Code:
public static char[] copyOfRange(char[] original, int from, int to) {
    int newLength = to - from;
    if (newLength < 0)
        throw new IllegalArgumentException(from + " > " + to);
    char[] copy = new char[newLength];
    System.arraycopy(original, from, copy, 0,
                     Math.min(original.length - from, newLength));
    return copy;
}
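For reference, a quick usage sketch of the method above. Note the slightly surprising semantics: `to` may run past the end of the source array, in which case the copy is zero-padded.

```java
public class CopyDemo {
    // Same pre-Java-6 implementation as quoted above
    public static char[] copyOfRange(char[] original, int from, int to) {
        int newLength = to - from;
        if (newLength < 0)
            throw new IllegalArgumentException(from + " > " + to);
        char[] copy = new char[newLength];
        System.arraycopy(original, from, copy, 0,
                         Math.min(original.length - from, newLength));
        return copy;
    }

    public static void main(String[] args) {
        char[] src = {'H', 'E', 'L', 'L', 'O'};
        // In-range copy: index 1 (inclusive) to 4 (exclusive)
        System.out.println(new String(copyOfRange(src, 1, 4))); // ELL
        // `to` past the end is legal: the tail is '\u0000'-padded
        System.out.println(copyOfRange(src, 3, 7).length);      // 4
    }
}
```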
 
It is a question I've been pondering since I made the change. I was half tempted to start an aesthetics/readability vs performance thread last night.

It was only when I asked myself why one would use an array directly that the speed benefit dawned on me. For this type of method, the efficiency gain definitely makes it worthwhile.

If you're still feeling curious could you try a couple of changes on your test harness?

1) Replace the StringBuffer with a StringBuilder. I'd be interested to know what kind of speedup not having synchronised methods makes.

2) Initialise it with StringBuilder(array[0].length * array.length) to prevent incremental enlargement. It grows at (currentCapacity * 2 + 2) when full, and the default capacity is 16, so there will be a fair few resizes on larger arrays.
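For 2), a minimal sketch of what I mean (assuming `array` is the `char[][]` under test; the method name is mine, not from the code above). Presizing means the builder never has to grow and copy its internal buffer:

```java
public class PresizeDemo {
    // Build the string with a StringBuilder presized to the exact final
    // length, so the (oldCapacity * 2 + 2) resize never triggers.
    static String presized(char[][] array) {
        StringBuilder sb = new StringBuilder(array.length * array[0].length);
        for (char[] row : array) {
            sb.append(row); // append(char[]) copies the whole row in one go
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char[][] a = { {'H','E','L','L','O'}, {'W','O','R','L','D'} };
        System.out.println(presized(a)); // HELLOWORLD
    }
}
```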

I wonder why the 1M iterations test has a much lower performance gap. The percentage difference was growing (32, 35, 40) and then dropped to 8. Might be the GC kicking in to free some memory.
 

Ok, I tested four different methods: the first was my original method with the array, the second was yours with a StringBuffer, the third was yours with a StringBuilder, and the fourth was yours with an initialised StringBuilder.

I tested each method over 100, 1,000, 10,000, 100,000 and 1,000,000 iterations, ten times each, then took an average of those ten results to plot the graph below:

performance.jpg


You can see that the array method is still the best in terms of performance. However, as the number of iterations increases, the StringBuilders come close and at 10,000 iterations slightly outperform the array method. The difference between an uninitialised and an initialised StringBuilder is more noticeable over fewer iterations; it might be more pronounced still if the string were longer than 16 characters, since 16 is the default capacity of a StringBuilder. The StringBuffer was the worst performer overall, which would seem to suggest that non-synchronised methods offer higher performance.

This is a sample of the code I used to test it:

Code:
char[][] newArray = new char[][] { {'H','E','L','L','O'}, {'W','O','R','L','D'} };

long start1 = System.nanoTime();
for (int i = 0; i < LOOP_NUM; i++) {
    multiCharToString_Array(newArray);
}
long end1 = System.nanoTime();
System.out.println("multiCharToString_Array: " + (end1 - start1) / LOOP_NUM);
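For anyone following along, `multiCharToString_Array` itself isn't shown in the thread. A plausible reconstruction of an array-based conversion (hypothetical, not the exact code under test) might look like this:

```java
public class TransposeTestSketch {
    // Hypothetical array-based conversion: copy every row into one flat
    // char[] and build the String with a single allocation at the end.
    static String multiCharToString_Array(char[][] array) {
        char[] flat = new char[array.length * array[0].length];
        int pos = 0;
        for (char[] row : array) {
            System.arraycopy(row, 0, flat, pos, row.length);
            pos += row.length;
        }
        return new String(flat);
    }

    public static void main(String[] args) {
        char[][] newArray = { {'H','E','L','L','O'}, {'W','O','R','L','D'} };
        System.out.println(multiCharToString_Array(newArray)); // HELLOWORLD
    }
}
```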
 
I wonder what your graph is saying about how the code is executed. I would have expected execution time to level off quickly as the code and data got cached, yet the times are still falling after 10K iterations.

Anyway, it interested me enough to run some tests on larger arrays. I tested from 10x10 to 500x500 filled with random characters and rising in increments of 10.

It rather nicely shows the different approaches diverging with increasing array size. You can also see the jumps when the StringBuffer/Builders resize :)

 
Interesting results. I have thrown in string concatenation as a comparison to these more efficient methods. It would be interesting to see what happens when you throw a huge array at the string-concatenation method; I think it would take a significant performance hit.

performance-1.jpg
 
I installed Java 1.6 on the command line, so I can use StringBuilders outside Eclipse now.

Try your test again using the java -Xint switch to force the interpreted JVM. You should see those graph lines stay fairly flat. I must read up on how HotSpot's adaptive optimisations work.

I'll run the whole test this evening but for now on a 400 * 400 array
Code:
run from	jvm	method	time (s)
shell	jit	array	0.004
shell	interp	array	0.02
eclipse	jit	concat	182
shell	jit	concat	420
Not sure why running from the shell was so much slower. My guess is Eclipse sets the JVM memory limit higher than the default and consequently needs to run the garbage collector less frequently.
 
I've installed NetBeans 6 to try out the built-in profiler. On larger arrays, when using +=, the JVM spends ~80% of its time running the garbage collector; for the other methods it is ~0.4%.

I rather like NetBeans except for an annoying glitch where popup windows, especially the one asking if I wish to save a snapshot of the profiling results, appear blank.

I modified my test class to use a one-dimensional char array and then tested from 100 to 100,000 elements.

Code:
Headline figures for 100,000 elements

Method                  time (milliseconds)
Array                   0.72
Presized StringBuilder  1.39
Default StringBuilder   1.91
+=                      48000

They are separated into two graphs, as += is so slow the others wouldn't even register. Both are timed in milliseconds.

Graph for += string concatenation


In the graph below:
"SB" is a StringBuilder
"array copy" is equivalent to the multiCharToString_Array method. It iterates over the test char[], copying it into a temporary char[], then builds a string
"new String(array)" simply builds a string from the test char[]

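Those legend entries can be sketched like so (my reconstruction, not the actual test class; method names are made up):

```java
public class OneDimSketch {
    // "array copy": iterate over the source, copy each char into a
    // temporary char[], then build the String from the copy.
    static String arrayCopy(char[] src) {
        char[] tmp = new char[src.length];
        for (int i = 0; i < src.length; i++) {
            tmp[i] = src[i];
        }
        return new String(tmp);
    }

    // "SB": append char by char into a presized StringBuilder
    static String builder(char[] src) {
        StringBuilder sb = new StringBuilder(src.length);
        for (char c : src) sb.append(c);
        return sb.toString();
    }

    public static void main(String[] args) {
        char[] src = "HELLO".toCharArray();
        // "new String(array)" simply hands the char[] to the constructor
        System.out.println(new String(src)); // HELLO
        System.out.println(arrayCopy(src));  // HELLO
        System.out.println(builder(src));    // HELLO
    }
}
```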
 
Interesting results. I'm sure there are many programs out there written in Java that are using inefficient methods. I know some people stick to one collection, such as LinkedList, because they think it's faster than something like ArrayList; that's true in some circumstances, but you have to understand the context it is being used in. Our discussion here shows how much minor changes to the same method can affect performance.
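On the LinkedList vs ArrayList point, a toy illustration of why context matters (timings are indicative only, so no numbers claimed): LinkedList tends to win at repeated head insertion, while ArrayList wins at random access.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListTradeoff {
    // Insert n elements at the head of the given list and return the
    // elapsed time. ArrayList must shift its backing array each time;
    // LinkedList just relinks its head node.
    static long timeHeadInserts(List<Integer> list, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) list.add(0, i);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 20_000;
        System.out.println("ArrayList:  " + timeHeadInserts(new ArrayList<>(), n) + " ns");
        System.out.println("LinkedList: " + timeHeadInserts(new LinkedList<>(), n) + " ns");
    }
}
```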

I currently use NetBeans 5.5, I might have to give 6 a try, it looks like it has some interesting features.
 
Our discussion here shows how much minor changes to the same method can affect performance.
Many moons ago I was a Software Developer and back in 99/00 I was working on a project for a major multinational.

Our system was like the Borg, steadily taking over as the authoritative repository for client information from a pile of legacy systems. Once we had all of a legacy system's client data, the associated functionality would be disabled on their system and any changes made through ours. Changes would be synced back to their systems through various means.

CORBA; COM (through a COM-CORBA bridge) for a Smalltalk system; MQSeries messaging middleware for AS/400s; email for some transactions! And for one crusty old mainframe we generated text files and used FTP.

It was these text files that ended up causing a problem. The process had worked fine for a couple of years, as these things often do..

We had completed a component that would communicate with Dun and Bradstreet's Worldbase of company information. The aim was to update client information we held like addresses, industry codes etc..

First time we exchanged information with DnB it was done via couriered CD (encrypted I'm glad to say :D) as there were rather a lot of records and DnB were having trouble launching their automated online system. More ftp, no fancy schmancy api's exposed over the net in those days :p

One sunny Sunday morning I traipsed into the office with the entire team for a major release. New functionality, database changes and new servers. Import from DnB completed successfully and the system came back up.

Yey, easy work for a day in lieu :) Then one of our components went down.. restarted and it maxed out CPU for a while and crashed again on an out of memory exception. Hmmm, whacked up the max heap size and restarted. Waited.. and waited some more.. and then some more.. Project manager started mumbling about rollback strategies and cutoff times... waited some more.. and praise the lord after something like three hours it spat out a file.

The following day I located and fixed the problem though it would have to wait for the next release to go into production.

For reasons known only to the writer, this particular component was generating the entire contents of a file in memory using Strings, basically like this:

Code:
String tmp = "";
while (moreLines()) {
    tmp += someFileFormatterObject.getNextLine();
}
someOtherObject.writeFile(tmp);

I modified two lines of code and the execution time went from hours to imperceptible. I've been a StringBuffer evangelist ever since :)
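The fix amounted to something like this (a sketch under my assumptions; the file-formatter and writer objects from the pseudocode are stood in for by an iterator of lines):

```java
import java.util.Arrays;
import java.util.Iterator;

public class BufferFix {
    // The StringBuffer version of the loop: append() mutates one internal
    // char array in place, so building the file is O(total length)
    // instead of the O(n^2) copy-per-line that += causes.
    static String buildFile(Iterator<String> lines) {
        StringBuffer tmp = new StringBuffer();
        while (lines.hasNext()) {
            tmp.append(lines.next());
        }
        return tmp.toString();
    }

    public static void main(String[] args) {
        Iterator<String> lines =
            Arrays.asList("line1\n", "line2\n", "line3\n").iterator();
        System.out.println(buildFile(lines).length()); // prints 18
    }
}
```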

lol, that turned out to be quite a long story.
 