Python Prime finder on the Raspberry Pi

Associate
Joined
15 Mar 2014
Posts
32
Hi all,
I've been looking for primes on my pi (model b) using python. Here is my current program:

f = open("primesList", "w") #Opens a new blank file to write primes to
f.write("2 \n")

def primeCheck(number, primeList):
->prime = True
->for num in primeList:
->->if number % num == 0:
->->->prime = False
->->->break #Exits the for loop if it is not prime
->->if prime == True:
->->->primeList.append(number)
->->->f.write("%s \n" % str(number)) #Writes to the primeList file

primeList = [2]
number = 1 #Starts at one and then adds 2 to avoid all evens

while len(primeList) < 1000: #Will find the first 1000 primes (can be changed)
->number += 2
->primeCheck(number, primeList)

f.close()


I would like to find the first million primes as a project but, as I am using the pi, it is rather slow. How can I make it faster?

Many thanks
 
Last edited:
You need a better algorithm.

What you have there runs in quadratic time (doubling the size of the input quadruples the time it takes), so it will take a long time to calculate the answer for an input of 1,000,000.
I bet it would still take a long time on a Core i7 due to the algorithm.

A simple and reasonably quick way of generating primes is using the Sieve of Eratosthenes
Note that this algorithm finds all the primes up to a certain value rather than the first n primes.
If you can find an upper bound for the nth prime (a quick google should help you here!) then you should be able to use a sieving algorithm to do what you want much faster.

Once you get a better algorithm then there are some further optimisations you could do too.
You're initialising the array with a single value and keep on appending to it. I don't know how Python deals with arrays, so this may be wrong, but in most programming languages it will have to allocate more memory for the array as it grows and copy the old values into the new array. If you know you are going to put a million values into it, then you can (assuming here, not knowing Python!) create the array with a specified size so it doesn't need to do the re-allocating and copying.

Secondly you're writing to the file inside your for loop.
It would be better to simply keep it all in memory until the end and then write it to the file then, so you're only accessing the file system once.

(Waits for D.P. to come in here and complain about programmers not being able to do maths/algorithms :p)
 
Last edited:
Thank you for the very detailed reply

You're initialising the array with a single value and keep on appending to it.
Could I fix this by filling the array with values 2 - 15,485,863 (the millionth prime) at the start of the program using this code:

for x in range(2, 15485863):
->numbersList.append(x)

Then go through the list with the sieve and delete any numbers that are found to be not prime so the list would become shorter with each pass of the sieve - on the first pass the list would start at 15,485,863 values long and end at around 7.5million long and so on?

I will have a go at the Sieve of Eratosthenes in python and post results here.

Thanks again :)
 
Here it is! It is extremely fast, for an input of 100,000 my old program takes around 1min10sec and the new one takes 1.5 seconds!

def eratosthenes(limit):
->valuesList = [False] * 2 + [True] * (limit-1)
->for x in xrange(int(limit**0.5 + 1)):
->->if valuesList[x] == True:
->->->for x in range(x**2, limit + 1, x):
->->->->valuesList[y] = False
->for z in xrange(limit + 1): #Writes to the file at end of program
->->if valuesList[z] == True:
->->->primesFile.write("%s \n" % str(z))

primesFile = open("MorePrimes", "w")
eratosthenes(1000000)

Thanks for all the help guys, any other suggestions for speed?
 
Use the sieve of Atkins, it is one.of the fastest and runs in sub linear time.
http://en.m.wikipedia.org/wiki/Sieve_of_Atkin

Next, change languages python is terrible for this kind of thing and is best left for scripting and prototyping. A good c implementation is likely 10-20 times faster.

Then do some profiling and see if there is a way to optimize from there. There are lots of small tricks, e.g when squaring it is better to multiply than use a pow function.

But really, what do you want to achive?
I would concentrate on better coding practices, e.g, there are better ways of doing file IO in python (with open)
 
Last edited:
Tryzan, when posting code please use [CODE] tags. It makes it much more readable and you don't need to create "fake tabs" for indentation.

E.g. the code from you last post:

Code:
def eratosthenes(limit):
    valuesList = [False] * 2 + [True] * (limit-1)
    for x in xrange(int(limit**0.5 + 1)):
        if valuesList[x] == True:
            for x in range(x**2, limit + 1, x):
                valuesList[y] = False
    for z in xrange(limit + 1): #Writes to the file at end of program
        if valuesList[z] == True:
            primesFile.write("%s \n" % str(z))
 
Next, change languages python is terrible for this kind of thing and is best left for scripting and prototyping. A good c implementation is likely 10-20 times faster.
Python is the only language I know, apart from HTML and CSS, and I don't have the time right now to learn a new language :(

I would concentrate on better coding practices, e.g, there are better ways of doing file IO in python (with open)
Do you know of somewhere where the best coding practices are laid out?

Tryzan, when posting code please use [CODE] tags.
Thank you, I was unaware that this was available :)
 
Python is fun and easy but it is not at all fast, so once you have picked the best algorithm I would forget about any performance really. You could still look at a few micro-optimization but these are trickier with python because A), no one who codes python really cares, B) the python interpreter may do a lot of magic.
E.g. y= x*x may be faster than y=x**2. The latter is a power op and in most languages these are slow (in C/C++ pow(x,2) is much slower than x*x ). But I have no idea if the python interpreter will encode **2 as a simple mul operator.



I don't know python well enough to know good resources but the python help pages themselves tend to tell you about good practices.
 
I agree, python I used mainly by beginners who do not need speed for their simple programs and therefore don't worry about it. I will try x*x instead of x**2 and edit this post with results.
EDIT:
Using x**2 for an input of 1000000 took an average of 11.957s
Using x*x for an input of 1000000 took an average of 10.385s

I will look at the help pages, thanks for putting me in the right direction.
 
Last edited:
You may be interested

Code:
import bitarray
def sieve(n):
    """bitarray sieve"""
    b=bitarray.bitarray([1]*(n+1))
    x=(int((n**.5))+1)
    for k in range(2,x):
        if b[k]:
            f=k+1 #thread
            e=(n//k)+1 #thread
            for g in range(f,e):
                if b[g]:
                    b[(g*k)]=0
    b=handle_squares(b)
    return b
                    
    
def handle_squares(ba):
    for s in range(2,int(len(ba)**.5)):
        if ba[s]:
            n=1
            sn=s**2
            while sn<len(ba):
                ba[sn]=0
                n+=1
                sn=s**(2*n)
    return ba

Tryzan50, I was somewhat obsessed with this these types of sieves for a number of months. The above is the fastest I could come up with. Your sieve is faster than this. Kudos and thank you. Nevertheless I found this pattern interesting.

-Hope you like it
 
Back
Top Bottom