C# help please

Associate
Joined
2 Oct 2003
Posts
2,121
Location
Chester
Hello!

Quite new to C# but gave myself the challenge of writing a promotion code generator like the ones you get on the Internet.

The results have to be unique and I want to generate between 10,000 and 500,000 a time. The class I have wrote seems to be rather slow and I am wondering if there is a faster way of doing this?

PHP:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace WindowsFormsApplication1
{
    class PromotionCodeGenerator
    {
        StreamWriter sw = new StreamWriter(@"c:\codes.csv");

        /// <summary>
        /// Generate a list of codes.
        /// </summary>
        public void GenerateCodes(int ID, int Qty)
        {
            HashSet<string> Codes = new HashSet<string>();

            for (int i = 0; i < Qty; i++)
            {
                string Code = GenerateCode(ID);
                
                if (Codes.Contains(Code))
                {
                    i--;
                }
                else
                {
                    Codes.Add(Code);
                    WriteToFile(Code);
                }
            }  
        }

        /// <summary>
        /// Generate a new promotion code.
        /// </summary>
        private string GenerateCode(int ID)
        {
            Random r = new Random();      

            return Convert.ToString(String.Concat(ID,"-", r.Next(10000000, 99999999)));
        }

        /// <summary>
        /// Write promotion code to file.
        /// </summary>
        private void WriteToFile(string Code)
        {
            sw.WriteLine(Code);
        }
    }
}
 
A few observations:

  • You're accessing the file after every number you generate, which isn't going to be fast. You'd be better off generating everything in memory first and then writing it to a file in one go at the end.
    This will likely avoid the need to have a StreamWriter as a class member. StreamWriter implements IDisposable and if you use it in just one place you can wrap it in a using statement. If you have it as a class member you should really implement IDisposable in your class to clean up resources when you are finished.
  • You shouldn't create a new Random object each time. Have one as a class member and use then. Calling the Next method on it is sufficient to generate a new random number.
  • HashSet.Add actually returns a boolean to say whether the member could be added, so you don't need the check for Contains.
    Thinking about it, a HashSet may not be the best data structure for this. If you use a Dictionary it may be more efficient to check whether an item is already in the collection. It may well not be the case, but could be something worth considering.

EDIT: I should point out, if you do use a Dictionary, use this overload in the constructor:
http://msdn.microsoft.com/en-us/library/tk84bxf4.aspx

This enables you to size it to the number of items you know you want in it.
 
Last edited:
[*]HashSet.Add actually returns a boolean to say whether the member could be added, so you don't need the check for Contains.
Thinking about it, a HashSet may not be the best data structure for this. If you use a Dictionary it may be more efficient to check whether an item is already in the collection. It may well not be the case, but could be something worth considering.

Just another suggestion, you can offload a lot of the workload in ensuring unique entries onto a database - which will be fairly optimised at this kind of task. You should get fairly decent write rates to the database if you LOCK the TABLE FOR WRITE prior to the run. Presumably you'd be writing to a CSV file anyway which would likely end up in a DB.
 
Just another suggestion, you can offload a lot of the workload in ensuring unique entries onto a database - which will be fairly optimised at this kind of task. You should get fairly decent write rates to the database if you LOCK the TABLE FOR WRITE prior to the run. Presumably you'd be writing to a CSV file anyway which would likely end up in a DB.

Checking whether a key exists in a Dictionary (which is pretty much a Hashtable under the surface) is an O(1) operation.
If you go to a database it will take hugely longer than just doing it in memory on the machine doing the work.
 
Because I'm bored at work on a Friday afternoon, here's a version that runs a little quicker, even using the StreamWriter as is. That bit doesn't actually take all that long when I looked at it!

Generates 500K codes in under 1 second on my machine.

Code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var sw = new Stopwatch();
            var pc = new PromotionCodeGenerator();

            sw.Start();
            pc.GenerateCodes(10, 500000);

            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
            Console.ReadKey();
        }
    }

    class PromotionCodeGenerator
    {
        readonly Random r = new Random();
        
        /// <summary>
        /// Generate a list of codes.
        /// </summary>
        public void GenerateCodes(int id, int qty)
        {
            using (var sw = new StreamWriter(@"C:\codes.csv"))
            {
                int codesGenerated = 0;
                var codes = new HashSet<string>();
                while (codesGenerated < qty)
                {
                    var code = GenerateCode(id);
                    if (codes.Add(code))
                    {
                        codesGenerated++;
                        sw.WriteLine(code);
                    }
                }
            }
        }

        /// <summary>
        /// Generate a new promotion code.
        /// </summary>
        private string GenerateCode(int ID)
        {
            return String.Concat(ID,"-", r.Next(10000000, 99999999));
        }
    }
}

EDIT: Just checked using a HashSet, looks like it is slightly quicker than a Dictionary on my machine so you'd probably be better off sticking with that.
 
Last edited:
Checking whether a key exists in a Dictionary (which is pretty much a Hashtable under the surface) is an O(1) operation.
If you go to a database it will take hugely longer than just doing it in memory on the machine doing the work.

It's dependent on whether it's a practical problem or a Comp Sci exercise and what the timings are, for a practical exercise you should be able to pump say 100k records a second into a DB.
 
It's dependent on whether it's a practical problem or a Comp Sci exercise and what the timings are, for a practical exercise you should be able to pump say 100k records a second into a DB.

Whatever the situation is then doing the dupe checking in a database is wrong. If it's a comp sci exercise then it will probably be about using the right data structure for the job.
If practical then the database will likely be on a separate machine and it will cost a whole lot more to marshal the data across the network than to check whether some data structure already contains the key you're trying to insert locally.
 
If you haven't got it already, this is one way.
Code:
        public string GenerateCode(int ID)
        {
            string rStr = System.IO.Path.GetRandomFileName().Replace(".", "");
            return string.Concat(ID, "-", r.Next(10000000, 99999999),rStr);
        }

rStr contains numbers though, so it's not just letters. There is probably a better way to do it, but that's one of the quickest if you want some random letters. It takes about an extra 6 seconds though.
 
Last edited:
An easy way to generate letters is to simply get an integer within the range of a subset of the ASCII character set you want and then cast to a char.
A nice way to do this in C# is to use a feature called Extension Methods (http://msdn.microsoft.com/en-us/library/bb383977.aspx), which effectively add on to the API of the class you're using.
So, I can create a class that adds extra bits onto the Random interface in the following way:

Code:
   public static class RandomExtensions
    {
        // Note: The upper limits are exclusive, hence set to one more
        // than the actual highest ASCII value we want
        private const byte AsciiUpperCaseLetterLowerLimit = 65;
        private const byte AsciiUpperCaseLetterUpperLimit = 91;
        private const byte AsciiLowerCaseLetterLowerLimit = 97;
        private const byte AsciiLowerCaseLetterUpperLimit = 123;

        public static string NextString(this Random random, int length)
        {
            var sb = new StringBuilder(length);

            for(int i = 0; i < length; i++)
            {
                sb.Append(random.NextChar());
            }

            return sb.ToString();
        }

        public static char NextUpperCaseChar(this Random random)
        {
            return GetAsciiChar(random, AsciiUpperCaseLetterLowerLimit, AsciiUpperCaseLetterUpperLimit);
        }

        public static char NextLowerCaseChar(this Random random)
        {
            return GetAsciiChar(random, AsciiLowerCaseLetterLowerLimit, AsciiLowerCaseLetterUpperLimit);
        }

        public static char NextChar(this Random random)
        {
            if (random.NextDouble() > 0.5)
            {
                return random.NextUpperCaseChar();
            }

            return random.NextLowerCaseChar();
        }

        private static char GetAsciiChar(this Random random, byte lowerLimit, byte upperLimit)
        {
            return (char)random.Next(lowerLimit, upperLimit);
        }
    }

If you paste that into your code, you should see a few extra methods on Random that you can use to generate chars and string comprising solely of letters.
I'm sure you can modify things to suit your needs if this isn't exactly what you want.

Now, to generate an 8-digit string in the same way as the numbers previously you would just do:
Code:
return String.Concat(ID, "-", r.NextString(8));
 
Back
Top Bottom