Python Prime finder on the Raspberry Pi

Tryzan50 · 9 Aug 2014 at 19:43

Hi all,
I've been looking for primes on my pi (model b) using python. Here is my current program:

f = open("primesList", "w") #Opens a new blank file to write primes to
f.write("2 \n")

def primeCheck(number, primeList):
->prime = True
->for num in primeList:
->->if number % num == 0:
->->->prime = False
->->->break #Exits the for loop if it is not prime
->->if prime == True:
->->->primeList.append(number)
->->->f.write("%s \n" % str(number)) #Writes to the primeList file

primeList = [2]
number = 1 #Starts at one and then adds 2 to avoid all evens

while len(primeList) < 1000: #Will find the first 1000 primes (can be changed)
->number += 2
->primeCheck(number, primeList)

f.close()

I would like to find the first million primes as a project but, as I am using the pi, it is rather slow. How can I make it faster?

Many thanks

Haircut · 9 Aug 2014 at 20:26

You need a better algorithm.

What you have there runs in quadratic time (doubling the size of the input quadruples the time it takes), so it will take a long time to calculate the answer for an input of 1,000,000.
I bet it would still take a long time on a Core i7 due to the algorithm.

A simple and reasonably quick way of generating primes is using the Sieve of Eratosthenes
Note that this algorithm finds all the primes up to a certain value rather than the first n primes.
If you can find an upper bound for the nth prime (a quick google should help you here!) then you should be able to use a sieving algorithm to do what you want much faster.

Once you get a better algorithm then there are some further optimisations you could do too.
You're initialising the array with a single value and keep on appending to it. I don't know how Python deals with arrays, so this may be wrong, but in most programming languages it will have to allocate more memory for the array as it grows and copy the old values into the new array. If you know you are going to put a million values into it, then you can (assuming here, not knowing Python!) create the array with a specified size so it doesn't need to do the re-allocating and copying.

Secondly you're writing to the file inside your for loop.
It would be better to simply keep it all in memory until the end and then write it to the file then, so you're only accessing the file system once.

(Waits for D.P. to come in here and complain about programmers not being able to do maths/algorithms

)

Tryzan50 · 9 Aug 2014 at 22:16

Thank you for the very detailed reply

Haircut said:
You're initialising the array with a single value and keep on appending to it.

Could I fix this by filling the array with values 2 - 15,485,863 (the millionth prime) at the start of the program using this code:

for x in range(2, 15485863):
->numbersList.append(x)

Then go through the list with the sieve and delete any numbers that are found to be not prime so the list would become shorter with each pass of the sieve - on the first pass the list would start at 15,485,863 values long and end at around 7.5million long and so on?

I will have a go at the Sieve of Eratosthenes in python and post results here.

Thanks again

peterwalkley · 10 Aug 2014 at 09:59

Continuously resizing the array will be slower than just doing a single scan at the end to fetch your results out.

As a general learning point I would highly recommend the algorithms 1 and 2 courses on www.coursera.org taught by Bruce sedgewick of Princeton.

Tryzan50 · 10 Aug 2014 at 10:28

Thanks for the advice peterwalkley I have now changed my program accordingly. I will have a look at those courses.

Program nearly ready

Tryzan50 · 10 Aug 2014 at 11:07

Here it is! It is extremely fast, for an input of 100,000 my old program takes around 1min10sec and the new one takes 1.5 seconds!

def eratosthenes(limit):
->valuesList = [False] * 2 + [True] * (limit-1)
->for x in xrange(int(limit**0.5 + 1)):
->->if valuesList[x] == True:
->->->for x in range(x**2, limit + 1, x):
->->->->valuesList[y] = False
->for z in xrange(limit + 1): #Writes to the file at end of program
->->if valuesList[z] == True:
->->->primesFile.write("%s \n" % str(z))

primesFile = open("MorePrimes", "w")
eratosthenes(1000000)

Thanks for all the help guys, any other suggestions for speed?

D.P. · 11 Aug 2014 at 03:38

Use the sieve of Atkins, it is one.of the fastest and runs in sub linear time.
http://en.m.wikipedia.org/wiki/Sieve_of_Atkin

Next, change languages python is terrible for this kind of thing and is best left for scripting and prototyping. A good c implementation is likely 10-20 times faster.

Then do some profiling and see if there is a way to optimize from there. There are lots of small tricks, e.g when squaring it is better to multiply than use a pow function.

But really, what do you want to achive?
I would concentrate on better coding practices, e.g, there are better ways of doing file IO in python (with open)

Dj_Jestar · 11 Aug 2014 at 10:05

Tryzan, when posting code please use [CODE] tags. It makes it much more readable and you don't need to create "fake tabs" for indentation.

E.g. the code from you last post:

Code:

def eratosthenes(limit):
    valuesList = [False] * 2 + [True] * (limit-1)
    for x in xrange(int(limit**0.5 + 1)):
        if valuesList[x] == True:
            for x in range(x**2, limit + 1, x):
                valuesList[y] = False
    for z in xrange(limit + 1): #Writes to the file at end of program
        if valuesList[z] == True:
            primesFile.write("%s \n" % str(z))

Tryzan50 · 11 Aug 2014 at 14:38

D.P. said:
Next, change languages python is terrible for this kind of thing and is best left for scripting and prototyping. A good c implementation is likely 10-20 times faster.

Python is the only language I know, apart from HTML and CSS, and I don't have the time right now to learn a new language

D.P. said:
I would concentrate on better coding practices, e.g, there are better ways of doing file IO in python (with open)

Do you know of somewhere where the best coding practices are laid out?

Dj_Jestar said:
Tryzan, when posting code please use [CODE] tags.

Thank you, I was unaware that this was available

D.P. · 11 Aug 2014 at 15:36

Python is fun and easy but it is not at all fast, so once you have picked the best algorithm I would forget about any performance really. You could still look at a few micro-optimization but these are trickier with python because A), no one who codes python really cares, B) the python interpreter may do a lot of magic.
E.g. y= x*x may be faster than y=x**2. The latter is a power op and in most languages these are slow (in C/C++ pow(x,2) is much slower than x*x ). But I have no idea if the python interpreter will encode **2 as a simple mul operator.

I don't know python well enough to know good resources but the python help pages themselves tend to tell you about good practices.

Tryzan50 · 11 Aug 2014 at 15:57

I agree, python I used mainly by beginners who do not need speed for their simple programs and therefore don't worry about it. I will try x*x instead of x**2 and edit this post with results.
EDIT:
Using x**2 for an input of 1000000 took an average of 11.957s
Using x*x for an input of 1000000 took an average of 10.385s

I will look at the help pages, thanks for putting me in the right direction.

diper · 20 May 2016 at 02:20

You may be interested

Code:

import bitarray
def sieve(n):
    """bitarray sieve"""
    b=bitarray.bitarray([1]*(n+1))
    x=(int((n**.5))+1)
    for k in range(2,x):
        if b[k]:
            f=k+1 #thread
            e=(n//k)+1 #thread
            for g in range(f,e):
                if b[g]:
                    b[(g*k)]=0
    b=handle_squares(b)
    return b
                    
    
def handle_squares(ba):
    for s in range(2,int(len(ba)**.5)):
        if ba[s]:
            n=1
            sn=s**2
            while sn<len(ba):
                ba[sn]=0
                n+=1
                sn=s**(2*n)
    return ba

Tryzan50, I was somewhat obsessed with this these types of sieves for a number of months. The above is the fastest I could come up with. Your sieve is faster than this. Kudos and thank you. Nevertheless I found this pattern interesting.

-Hope you like it