ZFS@Home

rsatd · 27 Jan 2009 at 17:42

BillytheImpaler said:
Oh yeah, I'm well aware that it's overkill. My current fileserver/Myth backend that is more than up to the task is an 800 MHz PIII. Spec'ing out the rig made me realize that for about $100 more I can have a relatively high-spec machine with all the modern goodies. For instance, the difference between 1 GiB of DDR2 and 4 GiB is $7. The difference between a single core 45W CPU and a dual core 45W CPU 200 MHz faster is $4. I'll pay those. The biggest part of that cost by far is the disks.

But unless you will be using it for something other than a server, there really is no point whatsoever. I can't see a home server needing a dual core any time in the future, at least not before we all get some sort of brain implants. Having said that, I also ended up looking at performance per cost when I was working out what to get, and kept upgrading to the next "level" just because it was under £5 difference. Do that enough and you end up with a pretty pricey box.

Actually, I was about to ask where you found those 45W chips, but then saw you are using $... That's so unfair, they never release anything I am looking for in the UK. Well, at least not when I was looking for low-powered models.

BillytheImpaler · 27 Jan 2009 at 17:50

My frontend is actually a 35W "Energy Efficient" dual core AMD. It's diskless, so other than one whisper quiet undervolted 120 mm Yate Loon it's silent.

It makes a difference in the power bill every month and will quickly pay for itself when compared to 65 or 95 watt CPUs. As for deciding between single and dual core procs at the same TDP, having another core to fall back on means that, for the cost of a sandwich at a restaurant, I can have a rig much more capable of media encoding. It'd also be more future-proof if I decide to repurpose it in a pinch as a desktop or frontend.
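A rough sketch of that power-bill arithmetic, for anyone who wants to plug in their own numbers. It assumes the box runs 24/7 and that the CPU really draws its full TDP the whole time, which overstates things, and the price per kWh is a made-up placeholder, not a figure from this thread.

```shell
# Yearly running cost of a given wattage, assuming a constant 24/7 draw.
# usage: yearly_cost WATTS PRICE_PER_KWH
yearly_cost() {
    awk -v w="$1" -v p="$2" 'BEGIN { printf "%.2f\n", w * 24 * 365 / 1000 * p }'
}

price=0.12                 # hypothetical price per kWh
yearly_cost 30 "$price"    # extra draw of a 65W part over a 35W part
yearly_cost 60 "$price"    # extra draw of a 95W part over a 35W part
```

Under those assumptions the gap is a few tens of currency units per year, so how fast the cheaper-to-run chip pays for itself depends mostly on the local electricity price.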

rick827 · 28 Jan 2009 at 08:24

BillytheImpaler said:
I'm still struggling with the concept of vdevs as subunits in a raidz. If I had 12 disks, why would I want to use vdevs of, say, 3 disks each instead of one giant vdev? Are there performance, security, or capacity benefits to either situation? Is it related to the hardware controller that the disks are plugged into, i.e. four 3-port SATA cards means you should use 4 3-disk vdevs?

If I'm following correctly, I think you may have muddled the term 'vdev'. A vdev is the unit you add to a ZFS array; these are usually whole disks (but can be partitions or even reserved space in a file). The collection of vdevs makes up a pool (zpool) of disks, in our case set up as a raidz pool. We then create volumes on the pool with the ZFS filesystem.

They advise for raidz to keep the pool under 9 vdevs; any more and the overhead is detrimental compared to splitting the pool into 2x 5 vdevs. Other than that, there's no reason why you can't use one huge pool.

rsatd · 28 Jan 2009 at 09:59

BillytheImpaler said:
My frontend is actually a 35W "Energy Efficient" dual core AMD. It's diskless, so other than one whisper quiet undervolted 120 mm Yate Loon it's silent.

It sucks to live in the UK; the best I found when I was still looking for such things were the 65W models.

rick827 · 28 Jan 2009 at 10:20

I think the X2 3800+ is 35W and the BE-2300 is 45W. Not sure about the newer CPUs, as I've not looked for a while.

BillytheImpaler · 28 Jan 2009 at 18:06

rick827 said:
If I'm following correctly, I think you may have muddled the term 'vdev'. A vdev is the unit you add to a ZFS array; these are usually whole disks (but can be partitions or even reserved space in a file). The collection of vdevs makes up a pool (zpool) of disks, in our case set up as a raidz pool. We then create volumes on the pool with the ZFS filesystem.

So in the aforementioned example:

Code:

zpool create -f mediablob raidz c2t0d0p0 c3t0d0p0 c4t0d0p0 raidz c5t0d0p0 c6t0d0p0 c7t0d0p0 raidz c8t0d0p0 c9t0d0p0 c10t0d0p0 raidz c11t0d0p0 c12t0d0p0 c13t0d0p0

The group

Code:

raidz c2t0d0p0 c3t0d0p0 c4t0d0p0

constitutes a vdev, right? Three physical devices are combined to form one virtual device.

How is redundancy treated for different vdevs within the same pool?

Might this be a way to add capacity to a machine without having to copy all the data in a pool to a separate machine?

For example, I might have a pool consisting of three 1 TB disks formed with the command

Code:

zpool create -f mediablob raidz c2t0d0p0 c3t0d0p0 c4t0d0p0

Let's say down the road I want to add three 1.5 TB disks to the mix. As above I can't expand the current vdev (assuming the question I asked first in this post deserves an affirmative answer), but I can add more vdevs to the same pool. I could add, let's say, c5t0d0p0 c6t0d0p0 c7t0d0p0, my new disks, to the pool mediablob and it would handle that.

The first, preexisting vdev would continue to do its thing, holding parity data for data stored on it. The second vdev I just added to the pool would be allocated data as the file system deems necessary and it would hold all parity data for information in its group. To see the benefits of ZFS's RAID 5-like structure I would always have to expand my storage by at least 3 disks at a time because having a vdev of just 2 devices produces no additional performance and has an overhead cost associated with it.

Or not.

Please tell me if^H^Hwhere I'm wrong.

BillytheImpaler · 28 Jan 2009 at 18:10

rsatd said:
It sucks to live in the UK; the best I found when I was still looking for such things were the 65W models.

If you're still looking, the AMD part number for the 2 GHz 35W X2 3800+ is ADD3800CUBOX.

rick827 · 29 Jan 2009 at 09:21

BillytheImpaler said:
So in the aforementioned example:

Code:

zpool create -f mediablob raidz c2t0d0p0 c3t0d0p0 c4t0d0p0 raidz c5t0d0p0 c6t0d0p0 c7t0d0p0 raidz c8t0d0p0 c9t0d0p0 c10t0d0p0 raidz c11t0d0p0 c12t0d0p0 c13t0d0p0

The group

Code:

raidz c2t0d0p0 c3t0d0p0 c4t0d0p0

constitutes a vdev, right? Three physical devices are combined to form one virtual device.

Correct, yes.

BillytheImpaler said:
How is redundancy treated for different vdevs within the same pool?

I've struggled to find much info on this, but the data is 'dynamically striped' over the available vdevs in that pool, with the stripe width calculated on write.

Here's all I could find: http://docs.sun.com/app/docs/doc/819-5461/gazdd?a=view

BillytheImpaler said:
Might this be a way to add capacity to a machine without having to copy all the data in a pool to a separate machine?

For example, I might have a pool consisting of three 1 TB disks formed with the command

Code:

zpool create -f mediablob raidz c2t0d0p0 c3t0d0p0 c4t0d0p0

Let's say down the road I want to add three 1.5 TB disks to the mix. As above I can't expand the current vdev (assuming the question I asked first in this post deserves an affirmative answer), but I can add more vdevs to the same pool. I could add, let's say, c5t0d0p0 c6t0d0p0 c7t0d0p0, my new disks, to the pool mediablob and it would handle that.

The first, preexisting vdev would continue to do its thing, holding parity data for data stored on it. The second vdev I just added to the pool would be allocated data as the file system deems necessary and it would hold all parity data for information in its group. To see the benefits of ZFS's RAID 5-like structure I would always have to expand my storage by at least 3 disks at a time because having a vdev of just 2 devices produces no additional performance and has an overhead cost associated with it.

Or not. Please tell me if^H^Hwhere I'm wrong.

Yes, you can do say..

Code:

zpool create -f mediablob raidz c2t0d0p0 c3t0d0p0 c4t0d0p0

then..

Code:

zpool add mediablob raidz c5t0d0p0 c6t0d0p0 c7t0d0p0

Giving you twice the disk space, with 2 vdevs rather than one. Each vdev has raidz redundancy independent of the other, so you could lose c2t0d0p0 and c5t0d0p0 without data loss. Obviously, losing 2 disks in the same vdev will result in loss of data. As far as I know, adding the second vdev doesn't copy anything across from the first: existing data stays where it is and new writes are striped over both vdevs, favouring the emptier one. Also, yes, I think you'd have to add the new vdevs with matching 3 disk setups.
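For the 1 TB plus 1.5 TB case asked about above, here is a back-of-envelope sketch of the usable space. It only assumes that a raidz1 vdev gives up one disk's worth of space to parity; it ignores metadata overhead and the TB/TiB difference, and the function name is just made up for the example.

```shell
# Usable space of one raidz1 vdev: one disk's worth goes to parity.
# usage: raidz1_usable DISKS SIZE_TB
raidz1_usable() {
    awk -v n="$1" -v s="$2" 'BEGIN { printf "%.1f\n", (n - 1) * s }'
}

old=$(raidz1_usable 3 1.0)   # existing vdev: three 1 TB disks
new=$(raidz1_usable 3 1.5)   # added vdev: three 1.5 TB disks
awk -v a="$old" -v b="$new" 'BEGIN { printf "pool usable: about %.1f TB\n", a + b }'
```

So the pool would go from roughly 2 TB to roughly 5 TB usable, with each vdev carrying its own parity.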

I might have a play later on. As it's *nix it's very flexible and you can just

Code:

mkfile 100m /disk/disk1
mkfile 100m /disk/disk2
mkfile 100m /disk/disk3
zpool create test raidz /disk/disk1 /disk/disk2 /disk/disk3

..etc

rick827 · 29 Jan 2009 at 09:41

Hmm, well I had a go..

Code:

# mkfile 100m disk1
# mkfile 100m disk2
# mkfile 100m disk3
# mkfile 100m disk4
# mkfile 100m disk5
# mkfile 100m disk6
# mkfile 100m disk7
# mkfile 100m disk8
# mkfile 100m disk9
# mkfile 100m disk10
# mkfile 100m disk11
# mkfile 100m disk12

then..

Code:

zpool create test raidz /disks/disk1 /disks/disk2 /disks/disk3 raidz /disks/disk4 /disks/disk5 /disks/disk6 raidz /disks/disk7 /disks/disk8 /disks/disk9

Code:

# zpool list
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
test    858M   159K   858M     0%  ONLINE  -

# zpool status
  pool: test
 state: ONLINE
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        test              ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            /disks/disk1  ONLINE       0     0     0
            /disks/disk2  ONLINE       0     0     0
            /disks/disk3  ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            /disks/disk4  ONLINE       0     0     0
            /disks/disk5  ONLINE       0     0     0
            /disks/disk6  ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            /disks/disk7  ONLINE       0     0     0
            /disks/disk8  ONLINE       0     0     0
            /disks/disk9  ONLINE       0     0     0

errors: No known data errors

Try adding just 2 disks..

Code:

# zpool add test raidz /disks/disk10 /disks/disk11
invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: pool uses 3-way raidz and new vdev uses 2-way raidz

Try force:

Code:

# zpool add -f test raidz /disks/disk10 /disks/disk11

# zpool list
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
test   1.02G   147K  1.02G     0%  ONLINE  -

# zpool status

  pool: test
 state: ONLINE
 scrub: none requested
config:

        NAME               STATE     READ WRITE CKSUM
        test               ONLINE       0     0     0
          raidz1           ONLINE       0     0     0
            /disks/disk1   ONLINE       0     0     0
            /disks/disk2   ONLINE       0     0     0
            /disks/disk3   ONLINE       0     0     0
          raidz1           ONLINE       0     0     0
            /disks/disk4   ONLINE       0     0     0
            /disks/disk5   ONLINE       0     0     0
            /disks/disk6   ONLINE       0     0     0
          raidz1           ONLINE       0     0     0
            /disks/disk7   ONLINE       0     0     0
            /disks/disk8   ONLINE       0     0     0
            /disks/disk9   ONLINE       0     0     0
          raidz1           ONLINE       0     0     0
            /disks/disk10  ONLINE       0     0     0
            /disks/disk11  ONLINE       0     0     0

errors: No known data errors

So it took it and the extra space seems to be available, but I guess there is some disadvantage, or why the need to force it?

Edit: http://docs.sun.com/app/docs/doc/819-5461/6n7ht6qvk?a=view

While this configuration is allowed, mismatched levels of redundancy result in unused space on the larger device, and requires the -f option to override the warning.
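A quick sanity check on the two `zpool list` outputs above, as a sketch. It assumes that `zpool list` reports raw space for raidz (parity included) and that every backing file contributes the same amount; the usable figure is only an estimate of one file's worth lost to parity per vdev, not something the pool printed. The function names are invented for the example.

```shell
# Raw size per backing file, inferred from the first `zpool list`:
# 858M across 9 files (a little under the 100m given to mkfile).
per_file_mib() { awk 'BEGIN { printf "%.1f\n", 858 / 9 }'; }

# Raw pool size in GiB for a given number of those files.
raw_gib() { awk -v n="$1" 'BEGIN { printf "%.2f\n", n * 858 / 9 / 1024 }'; }

# Rough usable MiB: every raidz1 vdev gives up one file's worth to parity.
# usage: usable_mib FILES_PER_VDEV...
usable_mib() {
    data=0
    for n in "$@"; do data=$((data + n - 1)); done
    awk -v d="$data" 'BEGIN { printf "%.0f\n", d * 858 / 9 }'
}

per_file_mib          # 95.3
raw_gib 11            # 1.02, matching the second `zpool list`
usable_mib 3 3 3      # 572
usable_mib 3 3 3 2    # 667
```

On those numbers the forced 2-file vdev adds about one file's worth of usable space for two files, so half of it goes to parity, against a third for the 3-file vdevs.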

rsatd · 29 Jan 2009 at 10:01

BillytheImpaler said:
If you're still looking, the AMD part number for the 2 GHz 35W X2 3800+ is ADD3800CUBOX.

Part number was the easy bit, finding one in a shop wasn't

But now I am enjoying a VIA board that allegedly uses 7W. Does the job if a tad slow with crappy flash sites.

yashiro · 29 Jan 2009 at 12:20

Use Lynx and never be troubled by Flash again!

rsatd · 29 Jan 2009 at 12:26

I just don't go to flash heavy sites. Most of them are crap and useless anyway. Besides, a GUI browser is faster because you can use a mouse. Unless Lynx can use a mouse too these days.

BillytheImpaler · 29 Jan 2009 at 17:06

Wow, thank you so much for that pair of fantastic posts, rick827. From reading those and the documentation I think this project is back on track. I'm A-Ok with adding storage in groups of 3. I'm ditching Nexenta right now in favor of SXCE to see if I like it better. When I'm all done I'll have a play in similar fashion.

My finger's hovering over the "Buy" button. I can't hold it much longer!

rick827 · 29 Jan 2009 at 17:27

Np

Out of interest, any reason for choosing SXCE over OpenSolaris 2008.11?

What I'm trying to do at the moment, now that I have all my large data/media files fairly redundant on there, is set up some kind of differential rsync on a cron to send diffs of files over the network to the ZFS machine. That way I have a redundant backup of all my configs etc. (essentially a hard disk mirror), and I also gain the benefit of the new timeslider function, so I get daily/weekly/monthly snapshots of the volume.

So reverting to e.g. my old conky config file, which I scrapped last week but now decide I actually prefer, is a doddle.

Just need to decide the best method of copying and what I should back up.
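One possible shape for that, as a sketch rather than a finished setup. The flags are standard rsync ones (`-a` archive, `-z` compress, `--delete` to mirror removals, `-e ssh` for transport), but the user, host, paths and script name are all hypothetical. The function only prints the command so it can be checked before it goes anywhere near cron.

```shell
# Print the rsync command that would push one directory to the ZFS box.
# usage: backup_cmd SRC_DIR DEST   (DEST looks like user@host:/path)
backup_cmd() {
    printf 'rsync -az --delete -e ssh %s/ %s/\n' "$1" "$2"
}

# Hypothetical names: user "richard", host "san", target under /export/zfs.
backup_cmd "$HOME/.config" "richard@san:/export/zfs/backup/config"

# A crontab entry to run a script containing that command nightly at 02:00
# (the script path is made up):
# 0 2 * * * /home/richard/bin/backup-configs.sh
```

rsync only sends what has changed since the last run, and because `--delete` removes files on the target that are gone from the source, it is the snapshots on the ZFS side that keep the older versions around.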

BillytheImpaler · 29 Jan 2009 at 17:40

None whatsoever. Should I go that route instead?

BillytheImpaler · 29 Jan 2009 at 18:29

Dear Lord, backspace doesn't even work! How do Solaris people live with this junk!?

rick827 · 29 Jan 2009 at 19:09

I'd go for OpenSolaris, it's more polished. Backspace should work as it's just the bash shell.

BillytheImpaler · 29 Jan 2009 at 20:01

I reloaded the CD and am trying a complete installation rather than the Core one I used above. I also told it to use a few other English character sets; hopefully one will work with my keyboard with no other futzing.

I'm definitely using OpenSolaris when I deploy this for real. Since I'm this far already I'll have a go with SXCE for my testing. Amusingly I've downloaded and burnt OpenSolaris before this full install is even 50% done.

yashiro · 29 Jan 2009 at 22:57

rick827 said:
What I'm trying to do at the moment, now that I have all my large data/media files fairly redundant on there, is set up some kind of differential rsync on a cron to send diffs of files over the network to the ZFS machine. That way I have a redundant backup of all my configs etc. (essentially a hard disk mirror), and I also gain the benefit of the new timeslider function, so I get daily/weekly/monthly snapshots of the volume.

So how do you retrieve your ZFS snapshots?
Ssh in and use commands, VNC in and use the gui or?

rick827 · 30 Jan 2009 at 09:56

yashiro said:
So how do you retrieve your ZFS snapshots?
Ssh in and use commands, VNC in and use the gui or?

Through SSH it's very easy. Once timeslider is set up it will take snapshots of your selected volumes at your selected intervals (15 mins, hourly, daily, weekly, monthly etc.). You can then browse a snapshot via SSH like this:

Code:

richard@SAN:/export/zfs# zfs list
NAME                        USED  AVAIL  REFER  MOUNTPOINT
tank/home                   542G   844G   541G  /export/zfs
.
.
richard@SAN:/export/zfs/.zfs# cd /export/zfs/.zfs/snapshot/
richard@SAN:/export/zfs/.zfs/snapshot# ls -l
total 236
drwxrwxrwx   3 100      sys            3 Jan  3 09:45 zfs-auto-snap:daily-2009-01-03-10:18
drwxrwxrwx   3 100      sys            3 Jan  3 14:18 zfs-auto-snap:daily-2009-01-04-00:00
drwxrwxrwx   3 100      sys            3 Jan  5 20:17 zfs-auto-snap:daily-2009-01-06-19:28
drwxrwxrwx   3 100      sys            3 Jan  5 20:17 zfs-auto-snap:daily-2009-01-07-00:00
drwxrwxrwx   3 100      sys            3 Jan  5 20:17 zfs-auto-snap:daily-2009-01-08-00:00
drwxrwxrwx   3 100      sys            3 Jan  5 20:17 zfs-auto-snap:daily-2009-01-09-00:00
richard@SAN:/export/zfs/.zfs/snapshot# cd zfs-auto-snap:daily-2009-01-03-10:18
richard@SAN:/export/zfs/.zfs/snapshot/zfs-auto-snap:daily-2009-01-03-10:18# ls -l
total 22
drwxrwxrwx+  8 richard  staff          8 Jan  2 11:43 Private

Find which file/folder you want and then copy it off the snapshot. Or you can use the timeslider GUI like this:

http://blogs.sun.com/erwann/entry/zfs_on_the_desktop_zfs
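For the command-line route, copying something back out can be wrapped up along these lines. This is only a sketch: it relies on the `.zfs/snapshot/<name>/` layout shown in the listing above, the function name is invented, and the file in the example call is made up.

```shell
# Copy one file or folder out of a snapshot, keeping permissions and times.
# usage: restore_from_snapshot FS_MOUNTPOINT SNAPSHOT_NAME RELATIVE_PATH DEST
restore_from_snapshot() {
    src="$1/.zfs/snapshot/$2/$3"
    if [ ! -e "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    cp -pR "$src" "$4"
}

# Example call (the file name is hypothetical):
# restore_from_snapshot /export/zfs zfs-auto-snap:daily-2009-01-03-10:18 \
#     Private/conkyrc "$HOME/conkyrc.old"
```

Snapshots are read-only, so the copy is written somewhere outside the snapshot and the live file is only replaced once you are happy with it.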
