When working in a Windows environment, you are bound to run into a case where you need a file server, especially if you are dealing with vendor apps that require a file share to store data (for availability). This is when I curse Windows loudly and look up the pricing for file server solutions in AWS (where I spend most of my time these days... for now). Windows AD + storage + AWS = very expensive. The storage volume needed for the share doesn't have to be big, but the minimum you can provision in AWS is 32 GiB with SSD and 2,000 GiB with HDD. What if you need to store less than 1 GiB and don't want to pay for the waste?
The answer is Linux and all of its apps. This is how I learned about GlusterFS and CTDB, which together can build a Samba cluster serving a Windows file share. I do want the cluster to be highly available; for test purposes I went with 2 nodes (ideally it would be 3 nodes in separate AZs).
Overview of the setup:
- 2 EC2 instances running Ubuntu, each with an extra 1 GiB EBS volume
- GlusterFS cluster
- CTDB + Samba
- no Active Directory (uses local user accounts)
This was a good chance for me to revisit Ansible, and boy, how I missed working with it. The community modules in Ansible are so great:
- Partition a disk? Has it.
- Format that disk and add it to /etc/fstab? Done.
- Create a GlusterFS volume? Pfft.
- Mount it to a path and add it to /etc/fstab? Thought we were done with this.
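The whole disk-to-volume pipeline above can be sketched as a handful of tasks. This is only a rough outline; the device name, mount point, and volume name are assumptions, not the exact values from my playbook:

```yaml
# Sketch of the disk-prep and GlusterFS tasks (hypothetical device/paths).
- name: Partition the extra EBS volume
  community.general.parted:
    device: /dev/nvme1n1
    number: 1
    state: present

- name: Format the new partition
  community.general.filesystem:
    fstype: xfs
    dev: /dev/nvme1n1p1

- name: Mount the brick and persist it in /etc/fstab
  ansible.posix.mount:
    path: /data/brick
    src: /dev/nvme1n1p1
    fstype: xfs
    state: mounted

- name: Create and start the replicated GlusterFS volume
  gluster.gluster.gluster_volume:
    name: gv0
    bricks: /data/brick/gv0
    replicas: 2
    cluster: "{{ groups['all'] }}"
    state: present
  run_once: true
```

Each of these modules also takes care of idempotency, which is exactly why reaching for them beats shelling out to parted and gluster by hand.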
One thing that slightly annoyed me about Ansible was the inconsistency around inventory formats. I whipped up a Python script to pull the EC2 instances (and extract their vars from tags), and it took some time to figure out how to output the inventory properly: when the script was fed directly to Ansible, it needed one structure; when I wrote the script's output to a static YAML file and fed that to Ansible, it had to be another structure. To be fair, ansible-inventory made this process somewhat painless.
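To illustrate the first of those two structures: when Ansible executes an inventory script directly, it expects a specific JSON shape on `--list`, with per-host vars under a `_meta.hostvars` key and groups hanging off `all`. A minimal sketch (the instance data here is hypothetical; the real script derives it from the ansible-var and ansible-group tags):

```python
import json

def build_inventory(instances):
    """Build the JSON structure Ansible expects from a dynamic inventory script."""
    inventory = {
        "_meta": {"hostvars": {}},           # per-host vars live here, not in the groups
        "all": {"children": ["ungrouped"]},  # every group must hang off "all"
        "ungrouped": {"hosts": []},
    }
    for ip, group, hostvars in instances:
        inventory["_meta"]["hostvars"][ip] = hostvars
        if group:
            if group not in inventory:
                inventory[group] = {"hosts": []}
                inventory["all"]["children"].append(group)
            inventory[group]["hosts"].append(ip)
        else:
            inventory["ungrouped"]["hosts"].append(ip)
    return inventory

# Hypothetical data matching the Terraform tags described below.
instances = [
    ("10.2.0.31", "master", {"name": "node1", "master": 1}),
    ("10.2.0.103", "", {"name": "node2", "master": 0}),
]
print(json.dumps(build_inventory(instances), indent=2))
```

The static YAML file, by contrast, uses the nested all/children/hosts layout shown further down, which is why the script can't simply dump the same dict to both targets.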
Let's see how this all happens. The source code has been pushed to GitLab, split into Terraform and Ansible repositories.
Standing up the AWS nodes with Terraform
We start with four nodes: two for serving the cluster and two for testing it (a Linux and a Windows bastion). The cluster nodes run in private subnets and the bastions run in public ones. Using Terraform, I spin up the EC2 instances in two AZs and tag the nodes accordingly, which will come in handy for building the Ansible inventory (this ensures that the nodes can coexist with others, filtered by tag values). Sample tag keys and values:
tags = {
  Type = "node"
  "ansible-var" = jsonencode({
    name   = each.key
    master = each.key == "node1" ? 1 : 0
  })
  "ansible-group" = each.key == "node1" ? "master" : ""
}
I used ansible-var and ansible-group to embed values that will be read by the inventory script, which allows passing logic to Ansible. In the block above, I'm making the first node the master and declaring custom hostnames. These hostnames will be written to each instance's /etc/hosts file for cluster discovery.
After running the Terraform config and the Python script to generate the Ansible inventory, I end up with an inventory like this:
# Generated by scripts/ec2.py
---
all:
  children:
    master:
      hosts:
        10.2.0.31: null
  hosts:
    10.2.0.103:
      master: 0
      name: node2
    10.2.0.31:
      master: 1
      name: node1
Part of this process involves setting up DNS entries that are used later. Notice that I'm referencing the nodes by their private IPs; Ansible connections to them will be routed through the bastion.
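Routing Ansible through the bastion can be done with SSH's ProxyJump option via the ansible_ssh_common_args variable. A minimal sketch (the bastion hostname and user here are assumptions):

```yaml
# group_vars/all.yml -- route SSH to the private nodes via the bastion
ansible_ssh_common_args: '-o ProxyJump=ubuntu@bastion.example.com'
```

With this in place, the inventory can keep the private IPs as-is and Ansible transparently jumps through the public host.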
Standing up the GlusterFS and CTDB/Samba cluster with Ansible
Now that the nodes are up and running, it's time to execute the Ansible playbook. This stage does the following:
- Preps the disks (partition, format, mount)
- Updates hosts file with known nodes (so we can reference them by their short names such as node1)
- Installs necessary tools (GlusterFS, Samba, CTDB)
- Creates GlusterFS volume, starts it and mounts it to a known path
- Creates users dedicated to accessing the share, starts CTDB/Samba, and adds the aforementioned users to the Samba database
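The clustering-specific configuration boils down to two small files. Broadly, CTDB needs a nodes file listing the private IPs of the cluster members, and Samba must be told it is clustered. Something along these lines (an excerpt, with paths and share settings as assumptions based on common Ubuntu defaults):

```ini
; /etc/ctdb/nodes -- one private IP per line (values from the inventory above)
; 10.2.0.31
; 10.2.0.103

; /etc/samba/smb.conf (excerpt) -- CTDB-aware Samba with local accounts
[global]
    clustering = yes
    security = user

[share]
    path = /mnt/data/share
    read only = no
```

The nodes file must be identical on every member, which is a natural fit for an Ansible template rendered from the inventory.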
After running the playbook.yml playbook, we should end up with a working SMB cluster. If we try mapping it on a Windows box with PowerShell:
PS C:\Windows\system32> $net = new-object -ComObject WScript.Network
PS C:\Windows\system32> $net.MapNetworkDrive("R:", "\\smb.example.com\share", $false, "username", "password")
PS C:\Windows\system32> R:
PS R:\> echo "Written from a windows 2022 bastion" > foo.txt
and then check the file on the share:
$ # on node2
$ cat /mnt/data/share/foo.txt
Written from a windows 2022 bastion
... the Samba share should work as expected.
The Ansible playbook I used was put into a single file; in a later iteration I will organize it into roles and add some status checks for the services. I'd also like to address CTDB cluster IP failover (moving the secondary IP to another host) and document the steps for expanding the GlusterFS volume (I imagine it's a matter of expanding the EBS volume, growing the partition, and then updating the GlusterFS volume).
References
- https://www.gluster.org/windows-cifs-fileshares-using-glusterfs-and-ctdb-for-highly-available-data/
- https://docs.gluster.org/en/main/Administrator-Guide/Accessing-Gluster-from-Windows/#creating-a-thinly-provisioned-logical-volume