task blocked for more than 120 seconds

This was seen quite frequently on CentOS 5 using software RAID 1. The message was only added as of kernel 2.6.18.194. I found two causes. The first is the same as described here: http://blog.ronnyegner-consulting.de/2011/10/13/info-task-blocked-for-more-than-120-seconds/ Adjusting the following helps: lower vm.dirty_background_ratio, and either lower or raise vm.dirty_ratio. Raising vm.dirty_ratio makes it less likely that synchronous I/O is ever triggered; lowering it means that when synchronous I/O does happen, there is less dirty data to flush, so it does not take as long.
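A minimal sketch of checking and changing these settings. The function name show_dirty_settings and the values 5 and 10 are only illustrative assumptions, not recommendations from the linked article; the right numbers depend on RAM size and disk speed.

```shell
# Show the current writeback thresholds (read-only, safe to run).
show_dirty_settings() {
    for key in dirty_background_ratio dirty_ratio; do
        printf 'vm.%s = %s\n' "$key" "$(cat "/proc/sys/vm/$key")"
    done
}
show_dirty_settings

# Example values only (assumptions); run as root to apply at runtime:
#   sysctl -w vm.dirty_background_ratio=5
#   sysctl -w vm.dirty_ratio=10
# To persist across reboots, put the same keys in /etc/sysctl.conf:
#   vm.dirty_background_ratio = 5
#   vm.dirty_ratio = 10
```

Changing one value at a time and re-running the workload makes it easier to tell which setting actually made the warning go away.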

The second cause seems to be related to barriers being enabled. It happens when I/O is heavy on a partition mounted with barrier=1. After setting barrier=0, I could no longer reproduce the issue. I wonder if there is a bug or a bad interaction when barrier=1.
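A rough sketch of how the barrier setting could be checked and changed, assuming an ext3/ext4 filesystem. The function name list_barrier_mounts and the mount point /data are hypothetical. Keep in mind that disabling barriers can risk filesystem corruption on power loss if the disks have a volatile write cache.

```shell
# List mounts whose options mention "barrier".
# Reads /proc/mounts by default; a different file can be passed as $1.
# Filesystems using their default barrier setting may not show the option.
list_barrier_mounts() {
    awk '$4 ~ /barrier/ { print $2, $3, $4 }' "${1:-/proc/mounts}"
}

# Hypothetical mount point /data; run as root to disable barriers at runtime:
#   mount -o remount,barrier=0 /data
# To make it permanent, add barrier=0 to the options column in /etc/fstab.
```

Calling list_barrier_mounts with no argument before and after the remount is a quick way to confirm the option took effect.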

df Showed More Disk Usage Than du

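A process is still holding an open FD to a file that has been deleted, so its blocks are not freed: df still counts them, but du cannot see the file. Find such processes with:
lsof -n -P | grep deleted
Then kill the processes holding on to the deleted file.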
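When the lsof output is long, it can help to reduce it to a list of PIDs first. The helper below is only a sketch; the name deleted_file_pids is made up, and it assumes lsof's default column layout, where the PID is the second field.

```shell
# Print the unique PIDs that still hold deleted files open.
# Reads `lsof -n -P` output on stdin; field 2 of the default output is the PID.
deleted_file_pids() {
    awk '/\(deleted\)/ { print $2 }' | sort -un
}

# Usage (review the list before killing or restarting anything):
#   lsof -n -P | deleted_file_pids
```

Restarting the owning service is often gentler than a plain kill, since it lets the daemon reopen its log or data files cleanly.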

Monitor Dirty Cache

while true; do grep -A1 dirty /proc/vmstat; grep Dirty /proc/meminfo; sleep 1; done

FUTEX_WAIT blocking on

http://meenakshi02.wordpress.com/2011/02/02/strace-hanging-at-futex/

MySQL Tuning

Use mysqltuner.pl and tuning-primer.sh.