Applications who want to share a volume can use below storageclass
to make their volumes shared by multiple pods
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: openebs-zfspv
parameters:
shared: "yes"
fstype: "zfs"
poolname: "zfspv-pool"
provisioner: zfs.csi.openebs.io
```
Now the provisioned volume using this storageclass can be used by multiple pods.
Here pods have to make sure of the data consistency and have to have locking mechanism.
One thing to note here is pods will be scheduled to the node where volume is present
so that all the pods can use the same volume as they can access it locally only.
This was we can avoid the NFS overhead and can get the optimal performance also.
Also fixed the log formatting in the GRPC log.
Signed-off-by: Pawan <pawan@mayadata.io>
We can not mount the datasets to more than one path via zfs mount command,
shifting to the legacy way of handling ZFS volumes where we can mount/umount
the datasets via legacy mount and umount commands.
This will also add a building block for SINGLE-NODE-MULTI-WRITER Capability.
Signed-off-by: Pawan <pawan@mayadata.io>
PVC will not bound if there are wrong parameters/poolname in the storageclass,
the ZFSVolume CR will be still created and will remain in Pending State,
deletion of the PVC will delete PVC and since PVC is not bound, ZFS-LocalPV
driver will not get the delete call and will leave the ZFSVolume CR hanging there.
Reverting the behavior introduced in https://github.com/openebs/zfs-localpv/pull/121,
Now PVC will be bound but still ZFSVolume will be in Pending state until the volume is created.
Signed-off-by: Pawan <pawan@mayadata.io>
More specifically,
- introduce helper function to get maps with all keys set to lowercase,
- introduce lookup helper based on such maps and
- change lookups for CreateVolumeRequest()s and CreateVolume()s so that
parameter keys are processed as lowercase irrespective of actual
spelling.
Signed-off-by: Christopher J. Ruwe <cjr@cruwe.de>
Readonly flag does not come as mount option, it has
separate field to mention readonly flag. ZFS-LocalPV
driver should check that field and add "ro" as mountoption.
Signed-off-by: Pawan <pawan@mayadata.io>
The controller does not check whether the volume has been created or not
and return successful. Which in turn binds the pvc to the pv.
The PVC should not bound until corresponding zfs volume has been created.
Now controller will check the ZFSVolume CR state to be "Ready" before returning
successful. The CSI will retry the CreateVolume request when it will get
a error reply and when the ZFS node agent creates the ZFS volume and sets the
ZFSVolume CR state to be "Ready", the controller will return success for the
CreateVolume Request and then PVC will be bound.
Signed-off-by: Pawan <pawan@mayadata.io>
adding grafana dashboard for ZFS Local PV that shows the following metrics:
- Volume Capacity (used space percentage)
- ARC Size, Hits, Misses
- L2ARC Size, Hits, Misses
- ZPOOL Read/Write IOs
- ZPOOL Read/Write time
This dashboard was inspired by https://grafana.com/grafana/dashboards/7845
Signed-off-by: Pawan <pawan@mayadata.io>
if there are no changes then `git describe --tags `git rev-list --tags --max-count=1`
may return older tag as there will be two tags referring to the same commit.
Using travis tag here to clearly differentiate the versions.
Signed-off-by: Pawan <pawan@mayadata.io>
This commit adds the support for creating a Raw Block Volume request using volumemode as block in PVC :-
```
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: block-claim
spec:
volumeMode: Block
storageClassName: zfspv-block
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
```
The driver will create a zvol for this volume and bind mount the block device at the given path.
Signed-off-by: Pawan <pawan@mayadata.io>
There are setups where nodename is different than the hostname.
The driver uses the nodename and tries to set the "kubernetes.io/hostname"
node label to the nodename. Which will fail if nodename is not same as
hostname. Here, changing the key to unique name so that the driver can set
that key as node label and also it can not modify/touch the existing node labels.
Now onwards, the driver will use "openebs.io/nodename" key to set the PV node affinity.
Old volumes will have "kubernetes.io/hostname" affinity, and they will also work as
after the PR https://github.com/openebs/zfs-localpv/pull/94, it supports all the node
labels as topology key and all the nodes have "kubernetes.io/hostname" label set. So
old volumes will work without any issue. Also for the same reason old stoarge classes
which are using "kubernetes.io/hostname" as topology key, will work as that key is supported.
This fixes the issue where the driver was trying to create the PV on the master node
as master node is having "kubernetes.io/hostname" label, so it is also becoming a valid
candidate for provisioning the PV. After changing the key to unique name, since the driver
will not run on master node, so it will not set "openebs.io/nodename" label to this node
hence this node will never become a valid candidate for the provisioning the volume.
Signed-off-by: Pawan <pawan@mayadata.io>
This commit adds the support for use to specify custom labels to the kubernetes nodes and use them in the allowedToplogoies section of the StorageClass.
Few notes:
- This PR depends on the CSI driver's capability to support custom topology keys.
- label on the nodes should be added first and then deploy the driver to make it aware of
all the labels that node has. If labels are added after ZFS-LocalPV driver
has been deployed, a restart all the node csi driver agents is required so that the driver
can pick the labels and add them as supported topology keys.
- if storageclass is using Immediate binding mode and topology key is not mentioned
then all the nodes should be labeled using same key, that means:
- same key should be present on all nodes, nodes can have different values for those keys.
- If nodes are labeled with different keys i.e. some nodes are having different keys, then ZFSPV's default scheduler can not effictively do the volume count based scheduling. In this case the CSI provisioner will pick keys from any random node and then prepare the preferred topology list using the nodes which has those keys defined. And ZFSPV scheduler will schedule the PV among those nodes only.
Signed-off-by: Pawan <pawan@mayadata.io>
k8s is very slow in attaching the volumes when dealing with the
large number of volume attachment object.
(k8s issue https://github.com/kubernetes/kubernetes/issues/84169)
The volumeattachment is not required for ZFSPV, so avoid creation
of attachment object, also removed the csi-attacher container as
this is also not needed as it acts on volumeattachment object.
k8s is very slow in attaching the volumes when dealing with the
large number of volume attachment object :
k8s issue https://github.com/kubernetes/kubernetes/issues/84169).
Volumeattachment is a CR created just to tell the watcher of it
which is csi-attacher, that it has to call the Controller Publish/Unpublish grpc.
Which does all the tasks to attach the volumes to a node for example call to the
DigitalOcean Block Storage API service to attach a created volume to a specified node.
Since for ZFSPV, volume is already present locally, nothing needs to done in Controller
Publish/Unpublish, so it is good to remove them.
so avoiding creation of attachment object in this change, also removed the csi-attacher
container as this is also not needed as it acts on volumeattachment object.
Removed csi-cluster-driver-registrar container also as it is deprecated and not needed anymore.
We are using csidriver beta CRDs so minimum k8s version required is 1.14+.
Signed-off-by: Pawan <pawan@mayadata.io>
looks like a bug in ZFS as when you change the mountpoint property to none,
ZFS automatically umounts the file system. When we delete the pod, we get the
unmount request for the old pod and mount request for the new pod. Unmount
is done by the driver by setting mountpoint to none and the driver assumes that
unmount has done and proceeded to delete the mountpath, but here zfs has not unmounted
the dataset
```
$ sudo zfs get all zfspv-pool/pvc-3fe69b0e-9f91-4c6e-8e5c-eb4218468765 | grep mount
zfspv-pool/pvc-3fe69b0e-9f91-4c6e-8e5c-eb4218468765 mounted yes -
zfspv-pool/pvc-3fe69b0e-9f91-4c6e-8e5c-eb4218468765 mountpoint none local
zfspv-pool/pvc-3fe69b0e-9f91-4c6e-8e5c-eb4218468765 canmount on
```
here, the driver will assume that dataset has been unmouted and proceed to delete the
mountpath and it will delete the data as part of cleaning up for the NodeUnPublish request.
Shifting to use zfs umount instead of doing zfs set mountpoint=none for umounting the dataset.
Also the driver is using os.RemoveAll which is very risky as it will clean
child also, since the mountpoint is not supposed to have anything,
just os.Remove is sufficient and it will fail if there is anything there.
Signed-off-by: Pawan <pawan@mayadata.io>
Validating few parameters for the ZFSVolume custom resource
- compression can be "on", "off", "lzjb", "gzip", "gzip-[1-9]", "zle" and "lz4"
- encryption can be "on", "off", "aes-128-ccm", "aes-192-ccm", "aes-256-ccm", "aes-128-gcm", "aes-192-gcm", and "aes-256-gcm"
- dedup can be "on" and "off"
- poolname can be string
- ownernodeid can be string
- thinprovision can be "yes" and "no"
- volumetype can be "DATASET" and "ZVOL"
Also added required fields needed to create ZFSVolume CR
- ownerNodeID
- poolname
- volumeType
- capacity
Signed-off-by: Pawan <pawan@mayadata.io>