jbd2 is munching your disks? Use ftrace to find why.

Have you ever been wondering why jbd2 (or jbd if your are still using ext3) is sitting at the top of iotop and consuming the most of IO bandwidth? Well, it’s certainly not because it’s doing that just to drive you nuts but there is a reason. And the reason is most probably there is an app that is doing a lot of sys_fsync(), sys_fdatasync() or sys_msync().
In case your are not on the latest and greatest kernel and BPF is not available, there is an easy way to confirm that using ftrace.

Just enable tracing of ext4_sync_file_enter events:

# echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable

And print out the output from trace or trace_pipe (refer to the documentation of ftrace for more information):

# cat /sys/kernel/debug/tracing/trace/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 16/16   #P:8
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
          mongod-2299  [001] ...1 661508.531446: ext4_sync_file_enter: dev 252,1 ino 267191 parent 267956 datasync 1 
          mongod-2299  [003] ...1 661508.543931: ext4_sync_file_enter: dev 252,1 ino 267191 parent 267956 datasync 1 
          mongod-2299  [003] ...1 661508.566134: ext4_sync_file_enter: dev 252,1 ino 267191 parent 267956 datasync 1 
          mongod-2299  [003] ...1 661511.255926: ext4_sync_file_enter: dev 252,1 ino 267191 parent 267956 datasync 1 
          mongod-2299  [000] ...1 661511.703643: ext4_sync_file_enter: dev 252,1 ino 267191 parent 267956 datasync 1 

Once you are done, just stop tracing the events and clear ftrace’s buffer:

# echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable
# echo > /sys/kernel/debug/tracing/trace

So in my case jbd2 excessive activity was caused by MongoDB’s journaling which syncs data every 50ms (starting from version 3.2).

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>