C++: workaround for a vector of references

I'm interested in how more seasoned C++ people solve this kind of problem, as I am now doing more production-related coding than prototyping.

Let's say you have a class with an unordered_map (hash map) that holds a lot of data, say 500 MB. You want to write an accessor that returns some subset of that data in an efficient manner.


Take the following, where BigData is some class that stores a moderate amount of data.

Code:
class A
{
   private:
      unordered_map<string, BigData> m_map;   // lots of data

   public:
     
    vector<BigData>   get10BestItems()
    {
        vector<BigData>  results;
        for_each ........  // iterate over m_map and add 10 best items to results
        // ... 
       return results;
    }; 

};

The accessor get10BestItems is not very efficient in this code because every selected item is copied from the map into the results vector. Returning the vector by value looks like a second copy, but compilers normally elide it (NRVO), and since C++11 the vector is otherwise moved rather than copied.
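
To see where the copies actually happen, here is a minimal sketch that counts copy constructions. Everything beyond the names A, BigData and m_map is an assumption for illustration: the score field, the add helper, the g_copyCount counter and the getItemsAtLeast selection rule are hypothetical, since the post does not say what "best" means.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

int g_copyCount = 0;  // hypothetical counter, only here to make copies visible

// Hypothetical stand-in for BigData; 'score' is an assumed ranking field.
struct BigData {
    int score = 0;
    BigData() = default;
    explicit BigData(int s) : score(s) {}
    BigData(const BigData& other) : score(other.score) { ++g_copyCount; }
    BigData(BigData&&) noexcept = default;
    BigData& operator=(const BigData&) = default;
    BigData& operator=(BigData&&) noexcept = default;
};

class A {
    std::unordered_map<std::string, BigData> m_map;
public:
    void add(const std::string& key, int score) { m_map.emplace(key, BigData(score)); }

    // Returns copies of every item whose score is at least minScore.
    std::vector<BigData> getItemsAtLeast(int minScore) const {
        std::vector<BigData> results;
        for (const auto& entry : m_map)
            if (entry.second.score >= minScore)
                results.push_back(entry.second);  // one copy per selected item
        return results;  // elided or moved, the elements are not copied again
    }
};

// Usage sketch: two items are selected, so exactly two copies are made.
void exampleUsage() {
    A a;
    a.add("x", 5);
    a.add("y", 1);
    a.add("z", 9);
    g_copyCount = 0;
    std::vector<BigData> best = a.getItemsAtLeast(5);
    assert(best.size() == 2);
    assert(g_copyCount == 2);
}
```

So the cost that remains is the copy of each selected item out of the map, which is what the rest of the options try to avoid.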


The obvious answer would be a vector of references, but C++ does not allow that: a reference is not an object and cannot be reassigned, so it cannot be a container element type:
Code:
   vector<BigData&> results;     // vector can't contain references.
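
One standard workaround is std::reference_wrapper from <functional> (C++11), which is a copyable, assignable object that behaves like a reference. The sketch below is only an illustration under assumptions: the score field, the add helper and the "highest score is best" ranking are hypothetical, not part of the original class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>  // std::reference_wrapper, std::cref
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for BigData; 'score' is an assumed ranking field.
struct BigData {
    int score;
};

class A {
    std::unordered_map<std::string, BigData> m_map;
public:
    using ConstRef = std::reference_wrapper<const BigData>;

    void add(const std::string& key, int score) { m_map.emplace(key, BigData{score}); }

    // Returns references to the n highest-scoring items, best first.
    // No BigData object is copied; only the small wrappers are.
    std::vector<ConstRef> getBestItems(std::size_t n) const {
        std::vector<ConstRef> results;
        results.reserve(m_map.size());
        for (const auto& entry : m_map)
            results.push_back(std::cref(entry.second));
        n = std::min(n, results.size());
        const auto middle = results.begin() + static_cast<std::ptrdiff_t>(n);
        std::partial_sort(results.begin(), middle, results.end(),
                          [](const BigData& a, const BigData& b) { return a.score > b.score; });
        // erase, not resize: reference_wrapper has no default constructor
        results.erase(middle, results.end());
        return results;
    }
};

// Usage sketch: the wrappers convert implicitly to const BigData&.
void exampleUsage() {
    A a;
    a.add("a", 3);
    a.add("b", 7);
    a.add("c", 5);
    std::vector<A::ConstRef> best = a.getBestItems(2);
    assert(best.size() == 2);
    assert(best[0].get().score == 7);
    for (const BigData& item : best)
        assert(item.score >= 5);
}
```

The references are only valid while the items stay in the map, so this has the same lifetime caveats as raw pointers; it mainly changes how the calling code reads.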


You could create the results vector on the heap and return a reference to that:
Code:
vector<BigData>&   get10BestItems()    // returns a reference to the vector
    {
        vector<BigData>*  results = new vector<BigData>;   // allocate on the heap (new returns a pointer)
        for_each ........  // iterate over m_map and add 10 best items to results
        // ... 
       return *results;   // return a reference; the caller must eventually delete it
    };


But then you are going to run into memory leak issues if you are not careful, because the caller has to remember to delete the vector. It also costs an extra heap allocation and still copies data from the map to the vector.



So we can look back at C-style coding and just use pointers:
Code:
vector<BigData*>   get10BestItems()    // returns a vector of pointers
    {
        vector<BigData*>  results ; // vectors of pointers
        for_each ........  // iterate over m_map and add 10 best items to results
        // ... 
       return results;  
    };

But most sources say not to use raw pointers unless absolutely necessary. There are options such as smart pointers and Boost's ptr_vector, but I would rather avoid these if possible.
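
For what it is worth, non-owning pointers to const are a fairly contained use of raw pointers: the caller cannot modify the items through them, and std::unordered_map guarantees that pointers and references to an element stay valid until that element is erased, even if the table rehashes. A minimal sketch, where the score field, the add helper and the ranking rule are assumptions made for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for BigData; 'score' is an assumed ranking field.
struct BigData {
    int score;
};

class A {
    std::unordered_map<std::string, BigData> m_map;
public:
    void add(const std::string& key, int score) { m_map.emplace(key, BigData{score}); }

    // Returns non-owning pointers to the n highest-scoring items, best first.
    // The map keeps ownership; callers must not delete these pointers.
    std::vector<const BigData*> getBestItems(std::size_t n) const {
        std::vector<const BigData*> results;
        results.reserve(m_map.size());
        for (const auto& entry : m_map)
            results.push_back(&entry.second);
        n = std::min(n, results.size());
        std::partial_sort(results.begin(),
                          results.begin() + static_cast<std::ptrdiff_t>(n),
                          results.end(),
                          [](const BigData* a, const BigData* b) { return a->score > b->score; });
        results.resize(n);
        return results;
    }
};

// Usage sketch: the pointers survive later insertions (and any rehash they cause).
void exampleUsage() {
    A a;
    a.add("a", 3);
    a.add("b", 7);
    a.add("c", 5);
    std::vector<const BigData*> best = a.getBestItems(2);
    for (int i = 0; i < 1000; ++i)
        a.add("extra" + std::to_string(i), 0);
    assert(best[0]->score == 7);
    assert(best[1]->score == 5);
}
```

Erasing an item, or destroying the A object, still leaves dangling pointers, so this relies on the map outliving every returned vector.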

I do know that the map is going to be static, so I am not too worried about dangling pointers. The one issue is that the calling code will have to be different to handle pointers. Stylistically this is not pleasant:

Code:
const BigData&   getTheBestItem()    // returns a const reference
{
       string bestID;
       for_each ........  // iterate over m_map, find bestID
       // ... 
       return m_map[bestID] ; // return a reference to the best item
}


vector<BigData*>   get10BestItems()    // returns a vector of pointers
{
        vector<BigData*>  results ; // vectors of pointers
        for_each ........  // iterate over m_map and add 10 best items to results
        // ... 
       return results;  
 };

E.g., if you want a single item then it is easy to return a const reference, but as soon as you want several items the caller has to deal with pointers instead.



A final option is to simply make the hash map public and return a vector of keys (in this case strings):
Code:
class A
{
      public:
      
         unordered_map<string, BigData> m_map;   // lots of data

   
     
    vector<string>   get10BestItemKeys()
    {
        vector<string>  results;
        for_each ........  // iterate over m_map and add 10 best KEYS to results
        // ... 
       return results;
    }; 

};


...
...
A aTest;
... // load data to map
...
vector <string> best10 =  aTest.get10BestItemKeys();

for_each .... // iterate over all KEYs in best10
{
    aTest.m_map.find(KEY);  // do something with item.
}



What do you guys do?
I have generally written code where I was the only consumer, and a lot of it was R&D work. Now I need to write code that will be used across a larger team and will go into production. It needs to be fast, but also maintain good software standards, so that it is unlikely to cause mistakes for future coders on the project.
 
Thanks for the suggestion.

At the moment I return a vector of regular pointers, which can be used if performance becomes critical, plus a vector of keys with an accessor for a single element, which is cleaner but slightly slower.

The main map container is basically static, so the pointers should be safe, and of course I made them const-safe.
 