How much memory does ZFS consume?

Moderator: cah

Post Reply
cah
General of the Army / Fleet Admiral / General of the Air Force
Posts: 1342
Joined: Sun Aug 17, 2008 5:05 am

How much memory does ZFS consume?

Post by cah »

http://milek.blogspot.com/2006/09/how-m ... nsume.html

When ZFS is in use, the standard tools report inaccurate values for free memory, because ZFS doesn't use the normal page cache and instead allocates kernel memory directly. When a low-memory condition occurs, ZFS should free its buffer memory. So how do you find out how much additional memory is potentially free?

Code: Select all

bash-3.00# mdb -k
Loading modules: [ unix krtld genunix specfs dtrace cpu.AuthenticAMD.15 ufs md ip sctp usba fcp fctl lofs zfs random nfs crypto fcip cpc logindmux ptm ipc ]
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     859062              3355   41%
Anon                       675625              2639   32%
Exec and libs                7994                31    0%
Page cache                  39319               153    2%
Free (cachelist)           110881               433    5%
Free (freelist)            385592              1506   19%

Total                     2078473              8119
Physical                  2049122              8004
>::quit
bash-3.00# echo "::kmastat"|mdb -k|grep zio_buf|awk 'BEGIN {c=0} {c=c+$5} END {print c}'
2923298816

[[ What is zio_buf for??? ]]
I believe it is the ZFS I/O buffer.

So the kernel consumes about 3.3 GB of memory, and about 2.7 GB of that is allocated to ZFS buffers, which should basically be treated as free memory. Approximate free memory on this host is: Free (cachelist) + Free (freelist) + 2923298816 bytes. --> WHY???

I guess the ZFS I/O buffer can be treated as FREE memory because ZFS releases it when the memory is needed elsewhere.
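
The arithmetic above can be double-checked with the figures quoted from ::memstat and ::kmastat (a sketch with the session's numbers hard-coded, not a live measurement):

```shell
# Recompute the "true free" estimate from the session above.
# 2923298816 is the zio_buf total in bytes; 433 and 1506 are the MB
# values of Free (cachelist) and Free (freelist) from ::memstat.
ZIO_BUF_BYTES=2923298816
ZIO_BUF_MB=$(( ZIO_BUF_BYTES / 1024 / 1024 ))   # 2787 MB (integer division)
FREE_MB=$(( 433 + 1506 + ZIO_BUF_MB ))
echo "Approximate free memory: $FREE_MB MB"     # roughly 4.6 GB
```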
CAH, The Great
cah
General of the Army / Fleet Admiral / General of the Air Force
Posts: 1342
Joined: Sun Aug 17, 2008 5:05 am

Re: How much memory does ZFS consume?

Post by cah »

I have created a script to calculate the true free memory under ZFS systems:

Code: Select all

#!/bin/ksh

# ZFS I/O buffer total in bytes (can be treated as free memory)
ZIO_BUF=`echo "::kmastat" | mdb -k | grep zio_buf | awk 'BEGIN {c=0} {c=c+$5} END {print c}'`
# Convert bytes to MB
((ZFS_BUF = $ZIO_BUF / 1024 / 1024 ))

# Free memory (MB column) from the cachelist and freelist lines of ::memstat
FREE=`echo "::memstat" | mdb -k | grep Free | awk 'BEGIN {c=0} {c=c+$4} END {print c}'`

((FREE_MEM = $ZFS_BUF + $FREE ))
echo "Free memory = $FREE_MEM MB"
The running result:

1. A UFS system has NO memory in zio_buf and therefore runs the same script faster. (AMD X.XX GHz CPU)

Code: Select all

%date;true_free_mem.sh;date
Wed May 27 20:31:15 EDT 2009
Free memory = 870 MB
Wed May 27 20:31:26 EDT 2009
2. A ZFS system needs to total up zio_buf and takes longer to produce the result. (Intel Xeon E5410 2.33 GHz Quad-Core CPU)

Code: Select all

%date;true_free_mem.sh;date
Wed May 27 17:32:19 PDT 2009
Free memory = 528 MB
Wed May 27 17:32:55 PDT 2009
Conclusion: ZFS takes some memory and it takes time to get the information out.
CAH, The Great
cah
General of the Army / Fleet Admiral / General of the Air Force
Posts: 1342
Joined: Sun Aug 17, 2008 5:05 am

Where's all my memory gone? Solaris 10 ARC memory usage with

Post by cah »

http://southbrain.com/south/2008/04/whe ... -sola.html

The other day, I wanted to know how much memory my system is really using for its own purposes. The modular debugger "mdb" has a nifty macro for this: it is called ::memstat - really straightforward.

This is the result:

Code: Select all

# mdb -k
Loading modules: [ unix krtld genunix specfs dtrace cpu.AuthenticAMD.15 uppc pcplusmp ufs mpt ip hook neti sctp arp usba fcp fctl qlc lofs fcip cpc random crypto zfs logindmux ptm nfs ]
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    4111973             16062   78%
Anon                       251805               983    5%
Exec and libs                6346                24    0%
Page cache                  37719               147    1%
Free (cachelist)           285302              1114    5%
Free (freelist)            547571              2138   10%

Total                     5240716             20471
Physical                  5121357             20005
Weird - a 16 GB kernel? Yes, that is normal, as we are using ZFS as a filesystem and its cache is stored in kernel memory. You may use the famous arcstats.pl Perl program from Neelakanth Nadgir to get detailed ARC statistics, but to understand it a little you may also start with the Sun::Solaris Perl modules shipped with Solaris 10.

For our ARC statistics we have to use Sun::Solaris::Kstat:

Code: Select all

#!/usr/bin/perl

use Sun::Solaris::Kstat;

my $k = Sun::Solaris::Kstat->new() or die "No kernel statistics module available.";

while (1)
{
  $k->update();
  my $kstats = $k -> {zfs}{0}{arcstats};
  my %khash = %$kstats;

  foreach my $key (keys %khash)
  {
    printf "%-25s = %-20s\n",$key,$khash{$key};
  }
  print "----------\n";
  sleep 5;
}
This example will print out something like this every 5 seconds:

Code: Select all

mru_ghost_hits            = 31005
crtime                    = 134.940576581
demand_metadata_hits      = 7307803
c_min                     = 670811648
mru_hits                  = 4479479
demand_data_misses        = 1616108
hash_elements_max         = 1059239
c_max                     = 10737418240
size                      = 10737420288
prefetch_metadata_misses  = 0
hits                      = 14405090
hash_elements             = 940483
mfu_hits                  = 9925611
prefetch_data_hits        = 0
prefetch_metadata_hits    = 0
hash_collisions           = 2486320
demand_data_hits          = 7097287
hash_chains               = 280319
deleted                   = 1301979
misses                    = 2263351
demand_metadata_misses    = 647243
evict_skip                = 47
p                         = 10211474432
c                         = 10737418240
prefetch_data_misses      = 0
recycle_miss              = 595519
hash_chain_max            = 11
class                     = misc
snaptime                  = 12168.15032689
mutex_miss                = 13682
mfu_ghost_hits            = 139332
You can see fields for the size ("size"), the maximum size ("c_max") - this is set in my case to 10GB via

Code: Select all

set zfs:zfs_arc_max=0x280000000
in /etc/system.
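
As a quick sanity check (my own addition, not part of the original post), that hex value maps to exactly 10 GiB:

```shell
# 0x280000000 bytes expressed in GiB, matching the c_max value above
echo $(( 0x280000000 ))                       # 10737418240 bytes
echo $(( 0x280000000 / 1024 / 1024 / 1024 ))  # 10 GiB
```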

You see the counters for hits, misses, metadata misses, and so on. To get values per time unit, just take the difference between two measurements and format it - or just use arcstats.pl, which yields output like this:

Code: Select all

# /var/home/pascal/bin/arcstat.pl
    Time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
10:45:21   16M    2M     13    2M   13     0    0  653K    8    10G   10G
10:45:22   512   196     38   196   38     0    0    96   57    10G   10G
10:45:23   736   219     29   219   29     0    0    76   27    10G   10G
10:45:24   647   210     32   210   32     0    0    74   39    10G   10G

So - at the end - we know that 10 GB of the 16 GB kernel memory is used for the ZFS cache.

Editor note, July 16th 2008: Yes, yes, yes, you are right, there is "kstat" available as a command (/usr/bin/kstat) and you can write:

Code: Select all

kstat zfs:0:arcstats:size
to get the actual ARC cache size.
Just take a look at the kstat program: it is written in Perl and uses... Sun::Solaris::Kstat to retrieve the values... :)
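
For convenience, the raw byte count can be converted to MB in one pipeline (a sketch assuming kstat's -p parseable output, which emits "name value" pairs):

```shell
# Print the current ARC size in MB; "kstat -p zfs:0:arcstats:size" emits
# "zfs:0:arcstats:size <bytes>", and awk converts the second field to MB.
kstat -p zfs:0:arcstats:size | awk '{printf "%.0f MB\n", $2 / 1048576}'
```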
CAH, The Great
cah
General of the Army / Fleet Admiral / General of the Air Force
Posts: 1342
Joined: Sun Aug 17, 2008 5:05 am

ZFS Best Practices Guide

Post by cah »

http://www.solarisinternals.com/wiki/in ... ices_Guide
Excerpted from ZFS Best Practices Guide wrote: Memory and Dynamic Reconfiguration Recommendations

The ZFS adaptive replacement cache (ARC) tries to use most of a system's available memory to cache file system data. The default is to use all of physical memory except 1 Gbyte. As memory pressure increases, the ARC relinquishes memory.

Consider limiting the maximum ARC memory footprint in the following situations:
  • When a known amount of memory is always required by an application. Databases often fall into this category.
  • On platforms that support dynamic reconfiguration of memory boards, to prevent ZFS from growing the kernel cage onto all boards.
  • A system that requires large memory pages might also benefit from limiting the ZFS cache, which tends to break down large pages into base pages.
  • Finally, if the system is running another non-ZFS file system, in addition to ZFS, it is advisable to leave some free memory to host that other file system's caches.
The trade-off to consider is that limiting this memory footprint means the ARC cannot cache as much file system data, and this limit could impact performance.

In general, limiting the ARC is wasteful if the memory that now goes unused by ZFS is also unused by other system components. Note that non-ZFS file systems typically manage to cache data in what is nevertheless reported as free memory by the system.
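
Tying this back to the earlier posts: capping the ARC is done with the same zfs_arc_max tunable shown above, and the /etc/system value is just the desired byte count in hex. A small sketch (the 2 GiB cap is an arbitrary example, not a recommendation):

```shell
# Emit an /etc/system line capping the ARC at a chosen size.
CAP_GIB=2    # example cap; pick a value that suits the workload
printf 'set zfs:zfs_arc_max=0x%X\n' $(( CAP_GIB * 1024 * 1024 * 1024 ))
```

This prints "set zfs:zfs_arc_max=0x80000000"; a reboot is needed for /etc/system changes to take effect.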
CAH, The Great
Post Reply