Kubernetes Persistent Volumes

thenewoc · 13 Aug 2023 at 15:19

I'm currently teaching myself Kubernetes and have come across the following behaviour, which I'm not sure is a bug or a feature.

I've created a 'local-storage' StorageClass with volumeBindingMode = WaitForFirstConsumer, a Persistent Volume with spec.capacity.storage = 5Gi, and a Persistent Volume Claim with spec.resources.requests.storage = 100Mi. These all apply fine to my minikube cluster, but when I apply the deployment that references this Persistent Volume Claim, instead of claiming 100Mi from the Persistent Volume, it takes the whole 5Gi.
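For reference, a minimal sketch of the three objects described above. The object names, the host path, and the node name `minikube` are assumptions for illustration; only the storage class name, binding mode, and sizes come from the description.

```yaml
# StorageClass without a dynamic provisioner; binding is delayed
# until a pod that uses the claim is scheduled.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-5gi            # hypothetical name
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/data             # assumed path; must already exist on the node
  nodeAffinity:                 # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - minikube      # assumed node name
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-claim               # hypothetical name
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi            # a minimum, not a slice of the PV
```

Once the claim is bound, `kubectl get pvc` reports the capacity of the bound PV (5Gi here) rather than the 100Mi requested.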

Has anyone here experienced this behaviour and can comment?

opethdisciple · 13 Aug 2023 at 17:25

I believe this is expected behaviour.

There is a 1 to 1 relationship between the PV and the PVC.

I don't think you can have multiple PVCs using the same PV.

Let's see if someone else agrees with me.

thenewoc · 13 Aug 2023 at 18:35

opethdisciple said:
I believe this is expected behaviour.

There is a 1 to 1 relationship between the PV and the PVC.

I don't think you can have multiple PVCs using the same PV.

Let's see if someone else agrees with me.

Thanks, would be interesting to know whether this is across the board and not just a limitation unique to hostpath and local.

It does also rather raise the question of why you're able to specify a 'subPath' under volumeMounts in a deployment's container spec, though.
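As a sketch of what 'subPath' is for: it mounts subdirectories of one volume at different paths inside a pod, so a single claim can serve several purposes. It splits directories, not capacity, and the claim is still bound to one whole PV. The deployment name, image, mount paths, and claim name `app-claim` below are all assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: subpath-demo            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: subpath-demo
  template:
    metadata:
      labels:
        app: subpath-demo
    spec:
      containers:
        - name: app
          image: busybox:1.36
          command: ["sh", "-c", "sleep 3600"]
          volumeMounts:
            - name: data
              mountPath: /var/lib/app
              subPath: app      # <volume root>/app
            - name: data
              mountPath: /var/log/app
              subPath: logs     # <volume root>/logs
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-claim  # assumed name of an existing PVC
```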

opethdisciple · 13 Aug 2023 at 19:22

I suppose, as a test, you could provision another persistent volume, but this time with a smaller capacity (say 300Mi), and see which one it chooses for the PVC.

My guess is it will use the smaller one.
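A sketch of such a second, smaller PV in the same 'local-storage' class. The PV name, host path, and node name are assumptions, and the size is written in Kubernetes units.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-300mi          # hypothetical name
spec:
  capacity:
    storage: 300Mi              # still at least the 100Mi the claim asks for
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/data-small       # assumed path; must already exist on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - minikube      # assumed node name
```

The claim has to be deleted and recreated for the test, since an already bound PVC stays with its PV.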

thenewoc · 14 Aug 2023 at 18:15

opethdisciple said:
I suppose, as a test, you could provision another persistent volume, but this time with a smaller capacity (say 300Mi), and see which one it chooses for the PVC.

My guess is it will use the smaller one.

Yes, it did choose the smaller one then. I was just a bit surprised that the mechanism is 1:1, given that dynamic provisioning is also a thing, though it seems only with some storage class provisioners.

blairw · 15 Aug 2023 at 16:23

I'd look at dynamic provisioning here - https://kubernetes.io/docs/concepts/storage/persistent-volumes/
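A minimal sketch of a dynamically provisioned claim, assuming the cluster has a StorageClass named `standard` backed by a dynamic provisioner (minikube normally ships one by default; `kubectl get storageclass` will confirm). The claim name is hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim           # hypothetical name
spec:
  storageClassName: standard    # assumed to exist and to provision dynamically
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi            # the provisioner creates a PV sized for this request
```

No PV is written by hand here: the provisioner creates one for the claim, so the binding is still 1:1 but the PV matches the requested size.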

dreamcat4 · 16 Aug 2023 at 21:14

sorry for interjecting here, but can i take this opportunity to ask more generally about kubes pvs? i'd be very grateful, and i don't mean to disrupt your existing conversation. feel free to ignore this if it's not relevant.

as an onlooker, it would be nice to get a better understanding of the official kubes persistent volumes api: how it is now, where it's gotten to, and where it fits into the broader landscape of common categories of persistent storage for cloud services.

for example, does the current version of the pv api let you select and define appropriate backends for different classes of storage (by performance, characteristics, etc.), depending on what the pv is going to be used for?

i'm not yet a kubernetes user (it's pretty scary, with a daunting amount of stuff to learn), so i haven't had the time to research this properly. but i can cite a few examples of different data storage backends, to highlight some known categories (although some of these are similar to each other):

1) fast local storage backend for a traditional relational database (with replication etc. handled by the database itself, for example postgres) --> fast performance and coherency. the rest of the app then connects to the hopefully fast (enough) db over a regular postgres (or mysql) tcp/tls connection.
2) large bulk distributed cluster storage (systems like ceph, which require several nodes to work) --> for large amounts of data served across many nodes, but perhaps a bit slower than 1)
3) regular cloud wan storage over cloud provider apis, such as the amazon s3 buckets api --> useful for storing small amounts of data (for example settings) or larger amounts of data, but with lower performance or longer service times than 2), since it's over wan cloud apis
4) other cloud-provider-specific apis for doing 1) and 2) which weren't already covered by 3)
5) other locally replicated pv volumes where the data is distributed along with the same container machine nodes, for example the vmware tiered approach (which is proprietary to vmware)
6) many other ones... i guess some are specific projects under the kubes umbrella for supplying better distributed pvs, and others are kubes vendors offering paid services

so presumably either the kubes pv api currently covers just the basic use case of 5), or it can be more flexible and expressive for selecting appropriate storage backend(s) to link to, right? that would depend on the performance requirements of the specific volume being set up for a given container or app deployment (i.e. what its intended use will be), and also on which of the available backends you choose to set up and offer as options.

for example: if the deployment is to azure, they will have their own service offerings, versus civo, google, aws, etc., or self-hosted (on-prem cluster), or a special kubes vendor.
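as a rough sketch of how a backend gets selected per provider: each StorageClass names a provisioner (usually a csi driver) plus driver-specific parameters, and a claim picks one via storageClassName. the class names below are hypothetical, and the example assumes the corresponding csi drivers are installed in the cluster.

```yaml
# Hypothetical class for an AWS cluster, using the EBS CSI driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd-aws            # hypothetical name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3                     # EBS volume type
volumeBindingMode: WaitForFirstConsumer
---
# Hypothetical class for an Azure cluster, using the Azure Disk CSI driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd-azure          # hypothetical name
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS          # Azure managed disk SKU
volumeBindingMode: WaitForFirstConsumer
```

a workload's pvc stays the same apart from storageClassName, so the choice of backend lives in the cluster's storage classes rather than in the app manifests.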

so i suppose the main thrust of my question is: can everything be done through the modern kubes pv api now (which was not the case before)? or do we still need to rely on other apis / 3rd-party kubes solutions that haven't been fully or officially integrated yet, to get access to the full range of useful stuff?

i suppose it doesn't matter unless it's actually needed. the main purpose is an application's backend / database storage and replication, as traditionally handled by 1), i.e. some postgres db (or equivalent / alternative). most of the rest is supposed to be idempotent / read-only push pipelines, baked into iteratively updated container image deployments, for example all of the static assets that get updated.