Cache server/NAS for office

We're a team of 13 and I'm looking for a local cache server solution: if Person A downloads a file, that file is also stored on a local NAS for, say, 30-60 days, so that if Person B tries to download the same file they retrieve it from the local NAS rather than the web source. The main requirement is for this to happen for Google Drive files, accessed via the Google Drive for Desktop app on Windows, rather than all HTTP requests. In fact there's really no need for caching anything other than Google Drive.

Firstly, I don't even know if this exists or how complicated it is to set up. I heard Google offered their own solution for Drive, but I don't believe this exists anymore. I imagine we'd need a local cache of around 50TB, with potential to expand if needed, but no redundancy given the usage.

TIA.
 
Possibly QNAP's HybridMount may do what you want:

 
I am not aware of anything that can do this with Google Drive while maintaining permissions, because the NAS connects to Google using the permissions of whatever user you authenticate with. This may not be a problem depending on how important permissions are to you.
 
That's what I've just asked of QNAP @Caged. Each user has their own login which accesses a team-wide shared folder. Whilst 'permissions' per se aren't that important - everyone has access to everything - it's a requirement of Cyber Essentials for obvious reasons and is used for change tracking and logging.
 
I could be wrong, but QNAP's HybridMount seems similar to Synology's Cloud Sync, in that the NAS syncs with Google Drive (or another cloud storage provider) and offers the data up locally (the "cache") via a network share, rather than acting as a proxy cache (Squid etc.) that 'invisibly' caches network traffic.
It would mean local users would use the network share rather than the Google Drive website/desktop app etc.
 
QNAP have come back to me and said:

Yes that’s right, when initially set up with a cloud account, no data is downloaded, just a file listing… when users access a file it becomes cached, so that if they access it again whilst it is in the cache it will be as fast as local access.

You cannot set a time limit on the cache, it works on the basis of LRU, Least Recently Used. When the cache is full… the oldest item is simply changed back to a non-cached item seamlessly.

For multiple accounts you would have to add a separate HybridMount per account name. You get 2 free licenses with Hybrid Mount, if you wanted to do it for 15 users you would need to purchase more from our software store.
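The LRU eviction QNAP describes above can be sketched as follows (a minimal illustration, not QNAP's actual implementation; the capacity and file names are made up):

```python
from collections import OrderedDict

class LRUFileCache:
    """Minimal LRU cache: recently accessed files stay local;
    the least recently used entry is evicted when the cache is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # name -> cached data, in recency order

    def access(self, name, fetch):
        if name in self.files:
            self.files.move_to_end(name)    # refresh recency: fast local hit
            return self.files[name]
        data = fetch(name)                  # slow cloud fetch on a miss
        self.files[name] = data
        if len(self.files) > self.capacity:
            self.files.popitem(last=False)  # evict the least recently used
        return data

cache = LRUFileCache(capacity=2)
cache.access("a.mov", lambda n: n.upper())
cache.access("b.mov", lambda n: n.upper())
cache.access("a.mov", lambda n: n.upper())  # hit: refreshes a.mov
cache.access("c.mov", lambda n: n.upper())  # full: evicts b.mov, not a.mov
print(list(cache.files))  # ['a.mov', 'c.mov']
```

Note there is no time-based expiry here, which matches QNAP's answer: eviction happens only when the cache fills, not after a fixed number of days.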

For clarity my questions were:

Can you confirm that HybridMount allows caching of accessed files/folders only, and not just local syncing of the entire Google Drive filesystem? Can this cache be set to have a time limit, i.e. after 60 days with no access requests the file is removed from the cache?

Can you also confirm that local caching is compatible with multiple Google Drive user accounts? ie joe.bloggs@ accesses a file on Google Drive and this is cached. Later brian.smith@ accesses the same file via their Google account; will the system recognise that the cached file is the same file?

So it looks like it is possible?

I'm actually attending the Media Production & Technology Show tomorrow which they're exhibiting at so will have to speak to them.
 
I'd like to know how the local cache handles a file that user 1 is editing and user 2 then opens to edit as well. Will it replicate multi-user editing like the online platform does?
 
No, Google Drive isn't that sort of platform. There's not really much going on in the world of cloud file storage/sharing with local cache since the best solution in most cases is to just brute force it by buying a faster internet connection.
 
I did wonder myself how much bandwidth an office environment working on mostly documents uses. It can't be so much that local caching makes a significant impact?

If responsiveness and availability of files is a priority wouldn't it just be better to store in a local environment and synchronise that to the cloud?
 
No, Google Drive isn't that sort of platform. There's not really much going on in the world of cloud file storage/sharing with local cache since the best solution in most cases is to just brute force it by buying a faster internet connection.

Which is what we currently do; we have a dedicated gigabit line and we're looking to upgrade to 10 gig. This solution is to reduce that bandwidth usage, cut upload/download times, and reduce our CO2 impact.

I did wonder myself how much bandwidth an office environment working on mostly documents uses. It can't be so much that local caching makes a significant impact?

If responsiveness and availability of files is a priority wouldn't it just be better to store in a local environment and synchronise that to the cloud?

We have about 250TB currently stored and will need to upload 1-2TB in bulk once or twice a week on average. I never said anything about mostly documents!

If you have a solution to your second line that meets Cyber Essentials requirements then I'm all ears; they need to be able to access Google Drive via an individual login, and not be able to access anything via a shared login (if that's a requirement of NAS <-> cloud).
 
Which is what we currently do; we have a dedicated gigabit line and we're looking to upgrade to 10 gig. This solution is to reduce that bandwidth usage, cut upload/download times, and reduce our CO2 impact.



We have about 250TB currently stored and will need to upload 1-2TB in bulk once or twice a week on average. I never said anything about mostly documents!

If you have a solution to your second line that meets Cyber Essentials requirements then I'm all ears; they need to be able to access Google Drive via an individual login, and not be able to access anything via a shared login (if that's a requirement of NAS <-> cloud).

I am struggling to understand your requirements to be honest. It seems like at present you are using Google Drive as your primary storage.

If you used a local resource (for high availability and speed reasons) then the Cloud would become a replication/backup solution.

How are you CURRENTLY using Google Drive? I assume a Google Workspace environment and Google Drive Desktop? Why does this solution not work for caching as is?

250TB isn't that unmanageable, and I assume that you are not frequently accessing a large % of that overall store. A caching solution will not do anything for your bulk upload requirements.

I think trying to implement a cache is likely added complexity and another point of failure. What is the average filesize and max single filesize you are dealing with? How available do these files need to be? I.e. are a handful accessed throughout the course of the day, or a large amount representing large data volumes at all times?

Knowing the industry and workflow might help. 250TB for 13 users is clearly not run-of-the-mill stuff.

It might also be worth considering that unless you are on an end-to-end 10Gbit-or-faster internal network, your clients can probably get the data off the cloud to local drives just as quickly as they could from an internal resource. Even with a 10Gbit internal network you are going to need a pretty serious NAS/server to provide file service to clients at high speed simultaneously.
 
Caching reduces the amount of bandwidth used and cloud read/write requests. That's it.

More data going back and forth over a network is responsible for far less CO2 than the embodied carbon of building all the components in a NAS, assembling them, shipping it and then powering it 24x7 in your office.

Google Drive is not designed to have a local cache; I cannot name a single third-party offering that will sync to Google Drive and then maintain permissions when serving LAN clients. There are solutions out there that can do what you want (e.g. Egnyte with Smart Cache, though the cost of a quarter of a petabyte would bankrupt you), but if you aren't bandwidth constrained, solve this by throwing in as fat a pipe as you can get, making sure you're not blocking outbound QUIC connections, using IPv6 etc., and letting the Drive client handle it.
 
We're a team of 13 and I'm looking for a local cache server solution: if Person A downloads a file, that file is also stored on a local NAS for, say, 30-60 days, so that if Person B tries to download the same file they retrieve it from the local NAS rather than the web source.

Be very careful here. You want to make sure that the cache is consistent with Google Drive, so that everyone gets the same file.
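One way a cache can stay consistent is to validate each cached copy against the checksum Drive reports before serving it. The Drive API does expose an md5Checksum per binary file, but the sketch below is purely illustrative: `REMOTE`, `CACHE` and `remote_md5` are in-memory stand-ins, not real API calls.

```python
import hashlib

# In-memory stand-ins: REMOTE plays the role of Google Drive, which reports
# an md5Checksum per (binary) file; CACHE starts with a stale local copy.
REMOTE = {"f1": {"md5": hashlib.md5(b"v2").hexdigest(), "data": b"v2"}}
CACHE = {"f1": b"v1"}

def remote_md5(file_id):
    # Hypothetical stand-in for fetching the file's md5Checksum from Drive.
    return REMOTE[file_id]["md5"]

def serve(file_id):
    """Serve from cache only if the local copy still matches Drive's checksum."""
    cached = CACHE.get(file_id)
    if cached is not None and hashlib.md5(cached).hexdigest() == remote_md5(file_id):
        return cached, "cache"
    data = REMOTE[file_id]["data"]  # re-download the authoritative copy
    CACHE[file_id] = data
    return data, "origin"

print(serve("f1"))  # stale copy detected -> (b'v2', 'origin')
print(serve("f1"))  # now consistent     -> (b'v2', 'cache')
```

The checksum round-trip is cheap relative to re-downloading a 100GB file, which is why cache gateways tend to validate metadata on every access rather than trusting a TTL.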
 
@Sin_Chase @Caged We're a video production company. Footage captured per filming day ranges from probably 200GB up to 2TB depending on the camera/format and type of shoot. Single files are anywhere from 200MB to 100GB, again depending on file format.

Current workflow is that the crew back up all files to an offline archive, and then the editors capture the footage to Google Drive. If both a) the same editor works on the edit and b) they're not due to edit it immediately, then that's fine and there's little impact. But if that's not the case, or multiple editors are working on the project semi-simultaneously, then there's a lot of wasted bandwidth, which can impact the rest of the office, or the editor ends up twiddling their thumbs waiting for the download to complete. Even with the dedicated gigabit line it's not always as fast as we'd want due to contention, Drive throttling, other network traffic etc.
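To put rough numbers on the waiting (back-of-envelope arithmetic only; the ~110 MB/s figure is an assumed sustained throughput for a shared 1 Gbit line, not a measurement):

```python
shoot_tb = 2              # a large filming day, per the figures above
usable_mb_per_s = 110     # assumed sustained throughput on a 1 Gbit line
seconds = shoot_tb * 1_000_000 / usable_mb_per_s
print(f"~{seconds / 3600:.1f} hours to pull one shoot down")  # ~5.1 hours
```

So a second editor re-downloading a big shoot plausibly loses half a working day, which is the case for serving the second copy from the LAN instead.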

If it's impossible then that's that!
 
I'm pretty impressed that Google Drive is even working for your use case. Have you just grown into it because it's what you had?

Shared storage for video editing doesn't have much overlap with cloud - if you don't really need these files to be accessible outside of the office then you could look at a local NAS with backups pushed up to a cloud service. 250TB is a lot of data though and you're being spoiled by Google Shared Drives being unlimited.
 