Saturday, 23 February 2013

SubsetSum@Home for the Pi

I had a request to try porting SubsetSum to the Raspberry Pi (ARMv6l). I've managed to get it done.

You can find the project files over on my Raspberry Pi project page or use this direct link:

Successful task completion:

Thursday, 21 February 2013

A journey from FORTRAN to C and OpenCL

You may have noticed a reduction in blog posts of late. The cause is a GPU porting project I've undertaken for theSkyNet POGS. I guess I should post something about it, in case some of you are interested.

As you may or may not know, the guts of the main POGS application is MAGPHYS. Basically, this science application loops through library files to find the best fit of different attributes for a given image pixel. The problem with the existing client is that the main section of code works through those libraries sequentially, brute-force style. Hence, the application is not very "parallel" by default and needs re-work to parallelise it and produce a successful GPU port.

We decided that converting from FORTRAN (F77) to C/C++ first would be the best option, as most GPU frameworks use some derivative of C99. This is not to say that frameworks such as OpenCL and CUDA don't support other languages; it was just cleaner this way. Our choice of GPU framework was OpenCL.

The actual port from FORTRAN to C was fairly straightforward; however, there were quite a few little "gotchas". These included array indexing differences (FORTRAN starts at 1), floating point problems (FORTRAN does not do nearest-even rounding), and overall output "weirdness" attributed to FORTRAN and C differences. None of these kept me bogged down for too long. They just meant a lot of debugging and customised functions to get around them. These workarounds are temporary, since the goal is to eventually move over to the C client only.
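To give a feel for the kind of customised functions mentioned above, here is a minimal sketch in C. It is not the actual POGS code: the names get1 and fortran_nint are hypothetical, and it assumes the rounding difference in question is how halfway cases are treated (FORTRAN's NINT rounds them away from zero, while C's rint() and printf() normally round them to the nearest even value).

```c
#include <assert.h>

/* Hypothetical helper: read element i of a C array using a FORTRAN-style
   1-based subscript, so ported loops can keep their original bounds. */
double get1(const double *a, int n, int i)
{
    assert(i >= 1 && i <= n);   /* valid FORTRAN subscripts run from 1 to n */
    return a[i - 1];            /* C storage starts at 0 */
}

/* Hypothetical helper: round like FORTRAN's NINT, i.e. halfway cases go
   away from zero (2.5 -> 3, -2.5 -> -3). Assumes x fits in a long. */
long fortran_nint(double x)
{
    long t = (long)x;               /* the cast truncates toward zero */
    double frac = x - (double)t;    /* signed fractional part */

    if (frac >= 0.5)
        t += 1;
    else if (frac <= -0.5)
        t -= 1;
    return t;
}
```

With these in place, a FORTRAN expression such as NINT(A(I)) could be carried over as fortran_nint(get1(a, n, i)) while the two clients are being compared, and swapped for plain C indexing later.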

The GPU port was a bit of a learning curve, as it was my first time implementing OpenCL kernel code. I've worked with things like threading before; however, parallelising code for processing on a GPU was new to me. After hours of reading, I dove right in and started coding.

Writing the C code to prepare the OpenCL program, kernel, devices, etc. and run the kernel was fairly easy. The tricky part was re-working some of the data that was going to be buffered into device memory and read back later. I decided that batching up sets of library models for the kernel threads to crunch was the best option. This essentially meant passing the library model arrays to device memory once and allocating space in device memory for the kernel threads to write their output. At the end of each batch I read back memory from the device and have a C-based loop consolidate those results appropriately.
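The batching pattern described above can be sketched in plain C, with an ordinary loop standing in for the GPU. This is an illustration only, not the real client: BATCH_SIZE, fit_kernel, find_best_fit and the squared-difference "fit" are all made up for the example. On a real device the inner loop over gid would be a single clEnqueueNDRangeKernel launch, and the consolidation step would follow a clEnqueueReadBuffer call.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE 4   /* hypothetical number of models per batch */

typedef struct {
    size_t index;      /* position of the best model in the library */
    float  score;      /* its fit score (lower is better) */
} best_fit;

/* Stand-in for the kernel: each "work item" gid scores one model and
   writes its result into its own slot of the output buffer. */
void fit_kernel(const float *models, float observed, float *out, size_t gid)
{
    float d = models[gid] - observed;
    out[gid] = d * d;
}

best_fit find_best_fit(const float *models, size_t n_models, float observed)
{
    float out[BATCH_SIZE];          /* plays the role of the device output buffer */
    best_fit best = { 0, 0.0f };
    int have_best = 0;

    assert(n_models > 0);

    for (size_t start = 0; start < n_models; start += BATCH_SIZE) {
        size_t remaining = n_models - start;
        size_t count = remaining < BATCH_SIZE ? remaining : BATCH_SIZE;

        /* "Launch" the batch: one work item per model. */
        for (size_t gid = 0; gid < count; gid++)
            fit_kernel(models + start, observed, out, gid);

        /* "Read back" the batch and consolidate it on the host. */
        for (size_t i = 0; i < count; i++) {
            if (!have_best || out[i] < best.score) {
                best.index = start + i;
                best.score = out[i];
                have_best = 1;
            }
        }
    }
    return best;
}
```

The point of the layout is that the model array is handed over once, every work item writes only to its own output slot, and the host touches the results just once per batch.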

At present, we get a 3-4x speed increase using the OpenCL implementation on modern machines with modern cards. There is a bit more work to do to increase that slightly by parallelising the post-batch part that accumulates output results. I find the biggest bottleneck to be reading large quantities of memory back from the device. If I can do the post-batch work on the device, I can reduce the amount of memory to be read back before moving on to the next batch of models.
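One common way to shrink the read-back is a pairwise (tree) reduction done before the results leave the device. The sketch below simulates it in plain C; reduce_min is a hypothetical name, and taking the minimum is only a stand-in for whatever the real post-batch step accumulates. On a GPU, each value of i in the inner loop would be a separate work item, with a barrier between strides.

```c
#include <assert.h>
#include <stddef.h>

/* Reduce n values to their minimum in place, halving the active range on
   each pass. n must be a power of two. The result ends up in values[0],
   so only one value per batch would need to be read back. */
float reduce_min(float *values, size_t n)
{
    assert(n > 0 && (n & (n - 1)) == 0);   /* power-of-two sizes only */

    for (size_t stride = n / 2; stride > 0; stride /= 2) {
        for (size_t i = 0; i < stride; i++) {
            if (values[i + stride] < values[i])
                values[i] = values[i + stride];
        }
    }
    return values[0];
}
```

For a batch of 8 results this takes three passes (strides 4, 2 and 1) and leaves a single float to copy back instead of eight.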

Overall, the porting process has been a great learning exercise. I'm hoping that in the next month the application can be stabilised and released to the public to crunch POGS on their GPUs. This needs approval from the project leader of course. I know there are a few more hurdles to overcome, but I'm optimistic.

Regarding my little cruncher projects (Raspberry Pi and ODROID), I will try and get back into them. I have 4 subjects coming up in Semester 1 so I'm guessing that this will drown most of my “play time”.

Thursday, 14 February 2013

Pete's Blog is useful for crunching on ARM

Just wanting to plug Pete's Blog over at:

He's got a section dedicated to Raspberry Pi and has been getting BOINC running on other ARM devices like the Cubieboard. Go check it out!

Saturday, 9 February 2013

An ATX power option

Some of you may already know how to do this, but if you're looking for some 5V power for your Raspberry Pis, ODROIDs, etc., wiring up an ATX power supply is one option.

Here are some photos to illustrate how. Note, this is not a permanent solution for me. I suspect anyone doing this for something permanent would do it a lot more cleanly and with a very efficient power supply. I use this for quick 12V and 5V power for testing. See this video for proper setup:

My 650W standby power supply with 5V 30A rail.

Using 5V from molex connector to micro USB cable. It's soldered, I promise! Depending on your power supply you may need to draw something from the 12V circuit for it to actually work.

Shorting pins so the PSU actually feeds power to those connectors when you flick on the power switch on the back of the PSU.

Sunday, 3 February 2013

Where's the ODROID update?

Sorry I haven't been keeping you up to date on how my ODROID-U2s have been going. I've been busy with school work, coming up to the end of my summer unit.

I've only loaded POGS on it so far to see what sort of RAC I get: around 1000, which is fairly low. I've been meaning to recompile that binary and also test some other projects. I should probably load some NCI projects on there (OProject and WUProp).

I've yet to load up my second U2 with Android to give NativeBOINC a shot, but I will shortly. I will! I need to go and get some correctly fitting DC jacks. I'm not repeating what I did with the one that's currently running, and I don't wish to share it given how dodgy it is.

In the meantime, I thought I'd post some links Ray_GTI-R has been sharing as they're quite useful. In case you don't know, he's fairly active in the community when it comes to ARM crunchers. Ray, hope you don't mind me giving you a plug.

Ray's links

His NativeBOINC U2s in the lead:

5V distribution board for his gear:
Fully fused distribution board for 5V 10A PSU. Each output has a 2.5A slow-blow fuse (all Maplin items) to each ODROID.

5V power supply: I've actually just purchased one of these!

DC jacks for U2:
Maplin L43AY

Power supply warning:

Cheers Ray for sharing your experiences with us!