Chapter 26. Memory: Monitoring Usage and Tuning

Memory management

Objectives

  • Know how to use entries in
    • /proc/sys/vm
  • Decipher
    • /proc/meminfo
  • Use vmstat
    • To display information about memory, paging, I/O, processor activity, and processes' memory consumption
  • Understand
    • How the OOM-Killer decides when to take action and selects which processes should be exterminated to open up some memory

Memory Tuning considerations

Tuning the memory sub-system can be a complex process. First of all, memory usage and I/O throughput are intrinsically related, because in most cases most memory is used to cache the contents of files on disk.

Thus, changing memory parameters can have a large effect on I/O performance, and changing I/O parameters can have an equally large converse effect on the virtual memory sub-system.

When tweaking parameters in /proc/sys/vm, the usual best practice is to adjust one thing at a time and look for effects. The primary (inter-related) tasks are:

  • Controlling flushing parameters
    • How many pages are allowed to be dirty and how often they are flushed out to disk
  • Controlling swap behavior
    • How many pages that reflect file contents are allowed to remain in memory, as opposed to pages that must be swapped out because they have no other backing store
  • Controlling how much memory overcommission is allowed
    • Since many programs never need the full amount of memory they request, particularly because of copy on write (COW) techniques
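
As a starting point for the "adjust one thing at a time" approach, it helps to record the current values before changing anything. The following is a minimal sketch that reads a few entries from /proc/sys/vm. The knob names are assumptions (the list above does not name specific files); they are standard entries on current kernels, but any that are missing are simply skipped. The helper read_vm_knobs is a hypothetical name.

```python
from pathlib import Path

# Assumed knob names, one or two per task in the list above.
KNOBS = [
    "dirty_background_ratio",  # flushing: % of memory dirty before background writeback starts
    "dirty_ratio",             # flushing: % of memory dirty before writers flush themselves
    "swappiness",              # swap behavior
    "overcommit_memory",       # overcommit policy
]

def read_vm_knobs(names=KNOBS, base="/proc/sys/vm"):
    """Return {name: value-as-string} for each knob that exists under base."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # Knob missing on this kernel, or not readable: skip it.
            continue
    return values

if __name__ == "__main__":
    for name, value in read_vm_knobs().items():
        print(f"{name} = {value}")
```

Saving this output before and after each change makes it easier to attribute an effect to a single parameter.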

Memory tuning can often be subtle, and what works in one system situation or load may be far from optimal in other circumstances.

Memory Monitoring Tools

Utility  Purpose                                                                Package
free     Brief summary of memory usage                                          procps
vmstat   Detailed virtual memory statistics and block I/O, dynamically updated  procps
pmap     Process memory map                                                     procps

/proc/sys/vm

This directory contains many tunable knobs to control the virtual memory system. Exactly what appears in this directory will depend somewhat on the kernel version. Almost all of the entries are writable (by root).

Remember that these values can be changed either by writing directly to the entry or by using the sysctl utility. Furthermore, values can be set at boot time by modifying /etc/sysctl.conf.
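
The relation between sysctl keys and the entries under /proc/sys can be sketched as follows: a key such as vm.swappiness corresponds to the file /proc/sys/vm/swappiness. This is only an illustration. The sample settings and their values are hypothetical, the helper names are made up for this example, and the simple dot-to-slash mapping is assumed to be enough for vm.* keys.

```python
def parse_sysctl_conf(text):
    """Parse sysctl.conf-style text into {key: value}; '#' and ';' start comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a "key = value" line
        settings[key.strip()] = value.strip()
    return settings

def sysctl_key_to_path(key):
    """Map a sysctl key such as 'vm.swappiness' to its /proc/sys entry."""
    return "/proc/sys/" + key.replace(".", "/")

# Hypothetical settings, for illustration only (not recommended values).
SAMPLE = """\
# example vm settings
vm.swappiness = 10
vm.dirty_ratio=20
"""

if __name__ == "__main__":
    for key, value in parse_sysctl_conf(SAMPLE).items():
        print(f"{sysctl_key_to_path(key)} <- {value}")
```

The sketch only prints the mapping; it does not write to any entry, since that requires root and changes live system behavior.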

vmstat

vmstat is a multipurpose tool that displays information about memory, paging, I/O, processor activity and processes. It has many options. The general form of the command is:

$ vmstat [options] [delay] [count]

If delay is given (in seconds), the report is repeated at that interval, count times. If count is not given, vmstat keeps reporting statistics until it is killed by a signal, such as Ctrl-C.

Sample:

$ vmstat 2 4

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 1631696  30828 650552    0    0   241    44   92  251  2  1 97  0  0
 0  0      0 1631696  30828 650560    0    0     0     0   79  281  2  1 98  0  0
 0  0      0 1631448  30836 650560    0    0     0     8  291  696 12  1 87  0  0
 2  0      0 1527064  31156 703748    0    0 26206     6 1234 2894 69 13  5 12  

Fields

Field      Subfield  Meaning
Processes  r         Number of runnable processes (running or waiting to be scheduled in)
Processes  b         Number of processes in uninterruptible sleep
Memory     swpd      Virtual memory (swap) used (KB)
Memory     free      Free (idle) memory (KB)
Memory     buff      Buffer memory (KB)
Memory     cache     Cached memory (KB)
Swap       si        Memory swapped in from disk (KB/sec)
Swap       so        Memory swapped out to disk (KB/sec)
I/O        bi        Blocks received (read) from devices (blocks/sec)
I/O        bo        Blocks sent (written) to devices (blocks/sec)
System     in        Interrupts/second
System     cs        Context switches/second
CPU        us        CPU time running user code (percentage)
CPU        sy        CPU time running kernel (system) code (percentage)
CPU        id        CPU time idle (percentage)
CPU        wa        Time waiting for I/O (percentage)
CPU        st        Time "stolen" from a virtual machine (percentage)
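
When vmstat output is collected for later analysis, the column names in the second header line can be used as keys. The following is a minimal sketch, assuming the default output layout shown in the sample above (one banner line, one line of column names, then rows of integers). The function name parse_vmstat is hypothetical, and the embedded text is the first three rows of that sample.

```python
def parse_vmstat(text):
    """Parse default `vmstat` output into a list of {column: int} dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # lines[0] is the group banner (procs, memory, ...); lines[1] holds column names.
    columns = lines[1].split()
    rows = []
    for line in lines[2:]:
        values = [int(v) for v in line.split()]
        rows.append(dict(zip(columns, values)))
    return rows

SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 1631696  30828 650552    0    0   241    44   92  251  2  1 97  0  0
 0  0      0 1631696  30828 650560    0    0     0     0   79  281  2  1 98  0  0
 0  0      0 1631448  30836 650560    0    0     0     8  291  696 12  1 87  0  0
"""

if __name__ == "__main__":
    for row in parse_vmstat(SAMPLE):
        print(f"free={row['free']} KB  bi={row['bi']}  bo={row['bo']}  idle={row['id']}%")
```

Note that the first row vmstat prints gives averages since boot, so it is usually the later rows that reflect current activity.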

If the option -S m is given, memory statistics will be in MB instead of KB.

With the -a option, vmstat displays information about active and inactive memory:

$ vmstat -a 2 4
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st
 5  0      0 1340280 392304 1152588    0    0   215    41  126  430  3  1 96  0  0
 0  0      0 1340156 392304 1152600    0    0     0     0   56  171  1  1 98  1  0
 0  0      0 1340156 392304 1152560    0    0     0     0   52  156  2  0 98  0  0
 0  0      0 1340156 392304 1152564    0    0     0    22   52  164  1  1 98  0  0

Active memory pages are those that have been used recently; they may be clean (disk contents are up to date) or dirty (need to be flushed to disk eventually). By contrast, inactive memory pages have not been used recently, are more likely to be clean, and are released sooner under memory pressure.

Pages can move back and forth between the active and inactive lists as they get newly referenced, or go a long time between uses.

To get a table of memory statistics and certain event counters, use the -s option:

$ vmstat -s
      3030444 K total memory
       967148 K used memory
      1200696 K active memory
       392660 K inactive memory
      1291556 K free memory
        32304 K buffer memory
       739436 K swap cache
            0 K total swap
            0 K used swap
            0 K free swap
        12365 non-nice user cpu ticks
          107 nice user cpu ticks
         2108 system cpu ticks
       341316 idle cpu ticks
          348 IO-wait cpu ticks
            0 IRQ cpu ticks
          251 softirq cpu ticks
            0 stolen cpu ticks
       695214 pages paged in
       133713 pages paged out
            0 pages swapped in
            0 pages swapped out
       447805 interrupts
      1575344 CPU context switches
   1470695353 boot time
         2906 forks

To get a table of disk statistics use the -d option:

$ vmstat -d

disk- ------------reads------------ ------------writes----------- -----IO------
       total merged sectors      ms  total merged sectors      ms    cur    sec
sda    19442   1192 1372984   30380   4219   4822  267642   27717      0     19
sdb      313      0    7968     254      1      0       2       0      0      0
sdc      193      0    3092     135      3      0       4       6      0      0
sdd      134      0    1934      86      3      0       4       6      0      0
sr0       20      0     128      18      0      0       0       0      0      0
dm-0   20303      0 1358378   42552   8825      0  267624   47023      0     19
dm-1      94      0    1264     134      0      0       0       0      0      0
loop0     54      2    2098     208      6      2      16      17      0      0
dm-2      61      0    2212      44      1      0       2       0      0      0
md0       65      0    1330       0      1      0       2       0      0      0
loop1     63      0    2225      42      0      0       0       0      0      0
dm-3      43      0    2072      39      0      0       0       0      0      0

Disk Fields

Field   Subfield  Meaning
reads   total     Total reads completed successfully
reads   merged    Grouped reads (resulting in one I/O)
reads   sectors   Sectors read successfully
reads   ms        Milliseconds spent reading
writes  total     Total writes completed successfully
writes  merged    Grouped writes (resulting in one I/O)
writes  sectors   Sectors written successfully
writes  ms        Milliseconds spent writing
I/O     cur       I/O in progress
I/O     sec       Seconds spent for I/O
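
One way to put these counters to use is to divide the milliseconds spent by the number of completed operations, which gives a rough average time per I/O for each device. The sketch below does this for two devices, with the figures copied from the vmstat -d sample above. The helper name avg_ms_per_io is hypothetical, and the result is only a coarse indicator, since the counters accumulate from boot.

```python
def avg_ms_per_io(total, ms):
    """Average milliseconds per completed I/O; 0.0 when nothing has completed."""
    return ms / total if total else 0.0

# (device, reads total, reads ms, writes total, writes ms),
# copied from the `vmstat -d` sample output.
SAMPLE = [
    ("sda", 19442, 30380, 4219, 27717),
    ("dm-0", 20303, 42552, 8825, 47023),
]

if __name__ == "__main__":
    for dev, r_total, r_ms, w_total, w_ms in SAMPLE:
        print(f"{dev}: {avg_ms_per_io(r_total, r_ms):.2f} ms/read, "
              f"{avg_ms_per_io(w_total, w_ms):.2f} ms/write")
```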

For quick statistics on a specific partition, use the -p option:

$ vmstat -p /dev/sda1 2 4
sda1          reads   read sectors  writes    requested writes
                 119       8510          3         18
                 119       8510          3         18
                 119       8510          3         18
                 119       8510          3         18

/proc/meminfo

$ cat /proc/meminfo
MemTotal:        3030444 kB
MemFree:         1326732 kB
MemAvailable:    2037868 kB
Buffers:           32940 kB
Cached:           666512 kB
SwapCached:            0 kB
Active:          1164908 kB
Inactive:         393316 kB
Active(anon):     859588 kB
Inactive(anon):     1620 kB
Active(file):     305320 kB
Inactive(file):   391696 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        858768 kB
Mapped:           263828 kB
Shmem:              2440 kB
Slab:              78924 kB
SReclaimable:      39896 kB
SUnreclaim:        39028 kB
KernelStack:        7856 kB
PageTables:        38552 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     1515220 kB
Committed_AS:    4240404 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       10140 kB
VmallocChunk:   34359720316 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       98240 kB
DirectMap2M:     2996224 kB

Fields Description

Entry            Meaning
MemTotal         Total usable RAM (physical minus some kernel reserved memory)
MemFree          Free memory in both low and high zones
Buffers          Memory used for temporary block I/O storage
Cached           Page cache memory, mostly for file I/O
SwapCached       Memory that was swapped back in but is still in the swap file
Active           Recently used memory, not to be claimed first
Inactive         Memory not recently used, more eligible for reclamation
Active(anon)     Active memory for anonymous pages
Inactive(anon)   Inactive memory for anonymous pages
Active(file)     Active memory for file-backed pages
Inactive(file)   Inactive memory for file-backed pages
Unevictable      Pages which cannot be swapped out of memory or released
Mlocked          Pages which are locked in memory
SwapTotal        Total swap space available
SwapFree         Swap space not being used
Dirty            Memory which needs to be written back to disk
Writeback        Memory actively being written back to disk
AnonPages        Non-file-backed pages mapped into user-space page tables
Mapped           Memory mapped pages, such as libraries
Shmem            Pages used for shared memory
Slab             Memory used in slabs
SReclaimable     Cache memory in slabs that can be reclaimed
SUnreclaim       Memory in slabs that can't be reclaimed
KernelStack      Memory used in kernel stack
PageTables       Memory being used by page table structures
Bounce           Memory used for block device bounce buffers
WritebackTmp     Memory used by FUSE filesystems for writeback buffers
CommitLimit      Total memory available to be used, including overcommission
Committed_AS     Total memory presently allocated, whether or not it is used
VmallocTotal     Total memory available in kernel for vmalloc allocations
VmallocUsed      Memory actually used by vmalloc allocations
VmallocChunk     Largest possible contiguous vmalloc area
HugePages_Total  Number of huge pages in the huge page pool
HugePages_Free   Huge pages that are not yet allocated
HugePages_Rsvd   Huge pages that have been reserved, but not yet used
HugePages_Surp   Huge pages that are surplus, used for overcommission
Hugepagesize     Size of a huge page
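
Since every line of /proc/meminfo has the form "Name: value [kB]", the file is easy to turn into a dictionary. The following is a minimal sketch; the function names are hypothetical, and the "used" figure (total minus free, buffers and cache) is just one common approximation, not a value the kernel reports. The embedded text is taken from the first lines of the sample output above.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {entry: int}.

    Sizes are in kB; the HugePages_* entries are plain page counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields:
            info[name.strip()] = int(fields[0])
    return info

def used_excluding_cache(info):
    """Memory in use, not counting free memory, buffers, or page cache (kB)."""
    return info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]

SAMPLE = """\
MemTotal:        3030444 kB
MemFree:         1326732 kB
MemAvailable:    2037868 kB
Buffers:           32940 kB
Cached:           666512 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(f"used (excluding buffers/cache): {used_excluding_cache(info)} kB")
```

On a live system the same functions can be fed with the contents of /proc/meminfo instead of SAMPLE.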

OOM Killer

The OOM (Out of Memory) killer is a mechanism that decides which processes should be exterminated to open up some memory. It is activated when the available memory is exhausted.

In order to determine which process shall be killed, there is a value called badness, which can be read from

/proc/[pid]/oom_score

Two entries in the same directory can be used to promote or demote the likelihood of extermination. The value of oom_adj is the number of bits the points should be adjusted by. Normal users can only increase the badness; a decrease (a negative value in oom_adj) can only be set by the superuser. oom_adj is now deprecated, and oom_score_adj is used instead; it takes values from -1000 to 1000, where -1000 exempts the process from being killed.
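
To see which processes the OOM killer would currently favor, the oom_score entries can be read for every process and sorted. The following is a minimal sketch; the function name oom_scores is hypothetical, and the proc argument exists only so that the function can be pointed at another directory for testing.

```python
import os

def oom_scores(proc="/proc"):
    """Return [(score, pid)] for every readable process, highest score first."""
    scores = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc, entry, "oom_score")) as f:
                scores.append((int(f.read().strip()), int(entry)))
        except (OSError, ValueError):
            continue  # process exited meanwhile, or entry unreadable
    return sorted(scores, reverse=True)

if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for score, pid in oom_scores()[:5]:
            print(f"pid {pid}: oom_score {score}")
```

Processes near the top of this list are the most likely candidates when memory runs out.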

Linux also offers a memory overcommit mechanism, which allows processes to be granted more memory than is actually available, on the expectation that not all of it will be used.

The behavior of this mechanism can be tuned through

/proc/sys/vm/overcommit_memory
  • 0
    • Permit overcommission, but refuse obvious overcommits, and give root users somewhat more memory allocation than normal users.
  • 1
    • All memory requests are allowed to overcommit
  • 2
    • Turn off overcommission. Memory requests will fail when the total memory commit reaches the size of the swap space plus a configurable percentage (50 by default) of RAM. This factor can be modified by changing
      • /proc/sys/vm/overcommit_ratio
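
The limit described for mode 2 can be checked against the CommitLimit entry in /proc/meminfo. The following is a sketch of the arithmetic, assuming a 4 kB page size and no huge pages reserved. The function name and its parameters are hypothetical. The calculation is done in whole pages, which is why the result is rounded down slightly.

```python
def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio=50,
                    hugepages_kb=0, page_kb=4):
    """Approximate the commit limit (kB): swap plus overcommit_ratio % of RAM."""
    ram_pages = (mem_total_kb - hugepages_kb) // page_kb
    swap_pages = swap_total_kb // page_kb
    return (ram_pages * overcommit_ratio // 100 + swap_pages) * page_kb

if __name__ == "__main__":
    # MemTotal and SwapTotal from the /proc/meminfo sample in this chapter.
    print(commit_limit_kb(3030444, 0), "kB")
```

With the sample values (MemTotal 3030444 kB, no swap, ratio 50), this gives 1515220 kB, which matches the CommitLimit line in the sample output. In that same sample Committed_AS is larger than CommitLimit, which suggests that system was not running in mode 2, since the limit is only enforced there.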