Use case: A node in the Kubernetes cluster is replaced with a new node. The
new node gets a different `kubernetes.io/hostname`, and the storage devices
that were attached to the old node are re-attached to the new node.
Fix: Instead of using the default `kubernetes.io/hostname` as the node affinity
label, this commit changes the driver to use `openebs.io/nodeid`. The ZFS LocalPV driver
will pick the value from the node and set it as the volume's node affinity.
Once the old node is removed from the cluster, the K8s scheduler will keep trying
to schedule the applications on the old node, since the PV affinity still points to it.
The user can now set the value of `openebs.io/nodeid` on the new node to the same
value that was present on the old node. This makes sure the pods/volumes are
scheduled to the new node.
Note: To migrate the PV to another node, we have to move the disks to that node,
remove the old node from the cluster, and set the same label on the new node using
the same key, which will let the k8s scheduler schedule the pods to that node.
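For example, the label can be copied over to the replacement node with kubectl
(the node name and the id value here are hypothetical):
```
kubectl label node new-node-1 openebs.io/nodeid=old-node-id --overwrite
```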
Other updates:
* adding faq doc
* renaming the config variable to nodename
Signed-off-by: Pawan <pawan@mayadata.io>
Co-authored-by: Akhil Mohan <akhilerm@gmail.com>
* Update docs/faq.md
Co-authored-by: Akhil Mohan <akhilerm@gmail.com>
Currently the controller picks one node, and the node agent keeps trying to
create the volume on that node, even though there might not be enough space
available there to create the volume.
The controller can instead try all the nodes sequentially and fail
the request only if volume creation fails on every node that satisfies the
topology constraints.
Signed-off-by: Pawan <pawan@mayadata.io>
An encrypted pool does not allow the volume to be pre-created for the
restore purpose. Here we change the design to do the restore first
and then create the ZFSVolume object, which will bind to the volume
already created while doing the restore.
Signed-off-by: Pawan <pawan@mayadata.io>
The ZFS Driver will use the capacity scheduler to pick the node
which has the least capacity occupied by volumes. Making this
the default scheduler, as it is better than volume-count based
scheduling. We can use the below storageclass to specify the scheduler:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-zfspv
allowVolumeExpansion: true
parameters:
  scheduler: "CapacityWeighted"
  poolname: "zfspv-pool"
provisioner: zfs.csi.openebs.io
```
Please note that there will be a change in behavior after the upgrade:
if the `scheduler` parameter is not set in the storage class, the
ZFS Driver will pick the node based on the volume capacity weight
instead of the volume count.
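For example, if node1 hosts two volumes of 50Gi each and node2 hosts four
volumes of 10Gi each, CapacityWeighted scheduling places the next volume on
node2 (40Gi occupied), whereas count-based scheduling would have picked node1.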
Signed-off-by: Pawan <pawan@mayadata.io>
This PR adds the capability to create a clone from a PVC directly:
```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-clone
spec:
  storageClassName: openebs-snap
  dataSource:
    name: pvc-snap
    kind: PersistentVolumeClaim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi
```
The ZFS-LocalPV driver will create one internal snapshot with the same
name as the new volume and will create a clone out of it. Also,
while destroying the volume, the driver will take care of deleting
the snapshot it created for the clone.
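Under the hood this corresponds roughly to the following ZFS operations
(pool and dataset names here are hypothetical):
```
zfs snapshot zfspv-pool/src-volume@pvc-clone
zfs clone zfspv-pool/src-volume@pvc-clone zfspv-pool/pvc-clone
```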
Signed-off-by: Pawan <pawan@mayadata.io>
For ZFSPV, all the node daemonset pods can go into the terminating state at
the same time, since the driver does not need any minimum availability of those pods.
Changing maxUnavailable to 100% so that K8s can upgrade all the daemonset
pods in parallel.
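For reference, the relevant part of the daemonset spec looks like this
(a minimal sketch, not the full operator yaml):
```yaml
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 100%
```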
Signed-off-by: Pawan <pawan@mayadata.io>
Added a schema validation for the backup and restore CRs. Also validating
the server address in the backup/restore controller.
The server address is validated as:
^([0-9]+.[0-9]+.[0-9]+.[0-9]+:[0-9]+)$
which is:
<any number>.<any number>.<any number>.<any number>:<any number>
Here we are validating just the format of the address, not that the IP itself
is correct, which would be a little more complex. In any case, if the IP is not
correct, the zfs send will fail, so there is no need for complex validation of
the IP and port.
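A quick way to sanity-check the format from a shell (dots are escaped here
for strictness; the addresses are hypothetical):
```
echo "192.168.1.10:8000" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:[0-9]+$' && echo valid
echo "192.168.1.10" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:[0-9]+$' || echo invalid
```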
Signed-off-by: Pawan <pawan@mayadata.io>
Now we have a single operator yaml which works for all
OS distros. We don't need OS-specific operator yamls anymore.
Signed-off-by: Pawan <pawan@mayadata.io>
This commit adds support for the Backup and Restore controllers, which watch for
the Backup and Restore events. The velero plugin will create a Backup CR
with the remote location information to create a backup, and the controller
will send the data to that remote location.
In the same way, the velero plugin will create a Restore CR to restore the
volume from the remote location, and the restore controller will restore
the data.
Steps to use the velero plugin for ZFS-LocalPV:
1. Install velero.
2. Add the openebs plugin:
   velero plugin add openebs/velero-plugin:latest
3. Create the volumesnapshot location.
For a full backup:
```yaml
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: openebs.io/zfspv-blockstore
  config:
    bucket: velero
    prefix: zfs
    namespace: openebs
    provider: aws
    region: minio
    s3ForcePathStyle: "true"
    s3Url: http://minio.velero.svc:9000
```
For an incremental backup:
```yaml
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: openebs.io/zfspv-blockstore
  config:
    bucket: velero
    prefix: zfs
    backup: incremental
    namespace: openebs
    provider: aws
    region: minio
    s3ForcePathStyle: "true"
    s3Url: http://minio.velero.svc:9000
```
4. Create a backup:
   velero backup create my-backup --snapshot-volumes --include-namespaces=velero-ns --volume-snapshot-locations=aws-cloud-default --storage-location=default
5. Create a schedule:
   velero create schedule newschedule --schedule="*/1 * * * *" --snapshot-volumes --include-namespaces=velero-ns --volume-snapshot-locations=aws-local-default --storage-location=default
6. Restore from the backup:
   velero restore create --from-backup my-backup --restore-volumes=true --namespace-mappings velero-ns:ns1
Signed-off-by: Pawan <pawan@mayadata.io>
* feat(zfspv): mounting the root filesystem to remove the dependency on the OS
Previously we mounted the individual libraries needed to run the zfs
binary inside the ZFS-LocalPV daemonset. The problem with this
is that each OS has a different set of libraries, so we needed different
operator yamls for different OS versions.
Here we mount the host's root directory inside the ZFS-LocalPV daemonset pod;
the driver does a chroot to this path and runs the command. Since all the
libraries present on the host are then available inside the pod, we don't need
to mount each library individually, and the same yaml works for all operating systems.
To be on the safe side, we are mounting the host's root directory
as a read-only filesystem.
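A minimal sketch of the relevant daemonset snippet (the volume and mount
names here are illustrative, not necessarily the exact ones from the operator yaml):
```yaml
containers:
  - name: openebs-zfs-plugin
    volumeMounts:
      - name: host-root
        mountPath: /host
        readOnly: true
volumes:
  - name: host-root
    hostPath:
      path: /
```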
Signed-off-by: Pawan <pawan@mayadata.io>
* adding comment for namespace
Signed-off-by: Pawan <pawan@mayadata.io>
CVE-2020-16845 has been reported for Go versions earlier than 1.14.7;
this PR upgrades the Go version used for the travis builds.
Signed-off-by: Pawan <pawan@mayadata.io>
A few users are running an old version of the driver where the
Status field is not present, so the mount will fail after the
upgrade to 0.9 or a later version.
Reverting back to checking whether the finalizer is set to decide if the
volume is ready to be mounted.
Signed-off-by: Pawan <pawan@mayadata.io>
ZFS does not create the zvol if the volume size is not a multiple of
the volblocksize. There are use cases where a user creates
a PVC with size 5G, which is 5 * 1000 * 1000 * 1000 bytes,
and this is not a multiple of the default volblocksize of 8k.
In ZFS, volblocksize and recordsize must be a power of 2 from 512B to 1M,
so keeping the size in the form of Gi or Mi should be
sufficient to make the volsize a multiple of the volblocksize/recordsize.
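For example, 5G = 5,000,000,000 bytes, and dividing by 8192 leaves a
remainder of 4,608, so it is not a multiple of 8k; 5Gi = 5,368,709,120
bytes = 655,360 * 8192, which is.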
Signed-off-by: Pawan <pawan@mayadata.io>
This field was added in Kubernetes 1.16 and it informs Kubernetes about
the volume modes that are supported by the driver. The default is
"Persistent" if it is not set.
An operator yaml that sets this field will not work on k8s 1.14 and 1.15. Since
the driver supports those k8s versions, there is no need to mention
volumeLifecycleModes in the operator, as the default is "Persistent" anyway.
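For illustration, this is the field being dropped from the CSIDriver object
(a minimal sketch; the apiVersion may differ per cluster version):
```yaml
apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
  name: zfs.csi.openebs.io
spec:
  volumeLifecycleModes:
    - Persistent
```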
Signed-off-by: Pawan <pawan@mayadata.io>
This issue is specific to xfs and occurs when we create a clone volume and the system takes time to create the device.
When we create a clone volume from an xfs filesystem, ZFS-LocalPV goes ahead and generates a new UUID for the clone volume, as we need a new UUID to mount the cloned filesystem. To generate the new UUID, ZFS-LocalPV first replays the xfs log by mounting the device to a tmp location.
What was happening here is that, since device creation was slow, we went ahead and created the tmp location to mount the clone volume, but since the device had not been created yet, the mount failed. On every subsequent try, since the tmp location was already present, it kept failing at that point on each reconciliation.
Signed-off-by: Pawan <pawan@mayadata.io>
Instead of checking for the finalizer, checking for the
volume state to be ready is more intuitive before mounting it.
Also removed a duplicate if statement for btrfs which was added while resolving
the merge conflict in https://github.com/openebs/zfs-localpv/pull/175.
Signed-off-by: Pawan <pawan@mayadata.io>
We changed the ubuntu docker image to 20.04 in https://github.com/openebs/zfs-localpv/pull/170,
which has issues with formatting the zvol as an xfs filesystem: a zvol
formatted with xfs fails to mount with the error "missing codepage or helper program, or other error".
Reverting back to ubuntu 19.10 to fix this issue.
Signed-off-by: Pawan <pawan@mayadata.io>
Added snapshot and clone related test cases. Also restructured
the BDD framework to loop through the supported fstypes and perform all
the test cases we have.
Signed-off-by: Pawan <pawan@mayadata.io>
btrfs, like xfs, needs a new UUID for the
cloned volumes. btrfs treats all the devices with the same UUID as the
same filesystem, so here we generate a new UUID for the cloned volumes
using the btrfstune command.
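Roughly, the driver runs something like the following on the clone's zvol
(the device path is hypothetical, and the flags may vary across btrfs-progs versions):
```
btrfstune -u /dev/zvol/zfspv-pool/pvc-clone
```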
Signed-off-by: Pawan <pawan@mayadata.io>
Now applications can use the btrfs filesystem by mentioning "btrfs"
as the fstype in the storageclass:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-zfspv
parameters:
  fstype: "btrfs"
  poolname: "zfspv-pool"
provisioner: zfs.csi.openebs.io
```
Signed-off-by: Pawan <pawan@mayadata.io>
Applications that want to share a volume can use the below storageclass
to make their volumes shared by multiple pods:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-zfspv
parameters:
  shared: "yes"
  fstype: "zfs"
  poolname: "zfspv-pool"
provisioner: zfs.csi.openebs.io
```
Now a volume provisioned using this storageclass can be used by multiple pods.
The pods have to take care of data consistency themselves and must use their own
locking mechanism. One thing to note here is that the pods will be scheduled to
the node where the volume is present, so that all the pods can use the same
volume by accessing it locally. This way we avoid the NFS overhead and also get
optimal performance.
Also fixed the log formatting in the GRPC log.
Signed-off-by: Pawan <pawan@mayadata.io>
We cannot mount a dataset to more than one path via the zfs mount command,
so we are shifting to the legacy way of handling ZFS volumes, where we can
mount/umount the datasets via the legacy mount and umount commands.
This will also add a building block for the SINGLE-NODE-MULTI-WRITER capability.
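In ZFS terms this means setting the dataset's mountpoint to legacy and using
the standard mount command, which allows multiple mount paths (dataset and
path names here are hypothetical):
```
zfs set mountpoint=legacy zfspv-pool/pvc-vol
mount -t zfs zfspv-pool/pvc-vol /mnt/path1
mount -t zfs zfspv-pool/pvc-vol /mnt/path2
```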
Signed-off-by: Pawan <pawan@mayadata.io>