Some time ago I had an issue with MySQL replication, which failed with two different errors in a row. The first one was:
Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Thankfully, this error was easy to fix with the help of the CHANGE MASTER TO statement:
change master to master_log_file='mysql-bin.000047', master_log_pos=152667618;
However, the second error kicked in right after that:
Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.000048' at 223481321, the last event read from '/var/log/mysql/mysql-bin.000048' at 4, the last byte read from '/var/log/mysql/mysql-bin.000048' at 4.'
This one is also a no-brainer once you know how to fix it.
Armed with mysqlbinlog, it was easy to verify that there were no events at or past position 223481321:
mysqlbinlog --base64-output=decode-rows --verbose --start-position=223481321 ./mysql-bin.000048
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
Thus the proper solution in that case was to manually point the slave to the next available bin log:
change master to master_log_file='mysql-bin.000049', master_log_pos=4;
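The fix above boils down to bumping the numeric suffix of the binary log name and starting at position 4. Here is a minimal Python sketch of that bookkeeping, assuming the usual basename.NNNNNN naming. The helpers next_binlog and change_master_sql are hypothetical names, not part of MySQL, and the sketch does not check that the next file actually exists on the master:

```python
import re

def next_binlog(name: str) -> str:
    """Return the binary log name that follows `name`, keeping the zero padding."""
    match = re.fullmatch(r"(.+)\.(\d+)", name)
    if match is None:
        raise ValueError(f"not a binary log name: {name!r}")
    base, index = match.groups()
    return f"{base}.{int(index) + 1:0{len(index)}d}"

def change_master_sql(current_log: str) -> str:
    """Build a CHANGE MASTER TO statement pointing at the start of the next log."""
    # Position 4 is right after the 4-byte magic header of a binary log file.
    return (f"change master to master_log_file='{next_binlog(current_log)}', "
            f"master_log_pos=4;")

print(change_master_sql("mysql-bin.000048"))
```

Only use the generated statement after confirming with mysqlbinlog, as above, that the current log really has nothing left to replay.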
Just learnt that DTrace can’t be run at full pelt when the rootless feature of Mac OS X (System Integrity Protection) is enabled. For example, iotop fails like this:
: probe description io:::start does not match any probes
It seems that the only way out is to boot into recovery mode and run csrutil disable to turn the protection off. Only after that can you call on your best friends such as iotop, iosnoop and many others published on the DTrace Book page.
Btw, if you haven’t bought the book yet, I highly recommend doing so.
Not long ago I had one of those humiliating moments when a simple question makes you go numb or, even worse, you begin to mumble absolute rubbish. That is exactly what happened to me recently, and being an afterthought person (which, of course, doesn’t give me any advantage) I decided to do some homework and recap the questions I had failed miserably.
- Linux PIPE
– Read “man 2 pipe” as it basically says it all in a single sentence:
pipe() creates a pair of file descriptors, pointing to a pipe inode, and places them in the array pointed to by filedes. filedes[0] is for reading, filedes[1] is for writing.
– If you want to go deeper, the source code is the best place to start.
- Linux VM overcommit
– Again, start by reading the documentation.
– Take a look at the code to figure out how the heuristic overcommit handling works. Especially, __vm_enough_memory() which is run by security_vm_enough_memory_mm(), which in turn could be called from different places, e.g. mmap_region(), acct_stack_growth(), do_brk(), insert_vm_struct(), dup_mmap().
– “man 3 malloc”, “man 3 mallopt”
– Go through do_brk() code.
– Read vm sysctl documentation about the swappiness parameter.
– swappiness comes into play in get_scan_count() which is called from shrink_lruvec().
– If the code looks murky, take a look at the answer published at unix.stackexchange.com, which goes into greater detail about vm.swappiness.
– Read about Split LRU
And of course, buy, read and re-read Understanding the Linux Kernel even if it’s a bit dated.
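To make the pipe() sentence from the man page concrete, here is a small Python sketch. It relies on os.pipe(), which wraps pipe(2) and returns the two descriptors as a (read, write) pair; the payload and variable names are just for illustration:

```python
import os

# os.pipe() wraps pipe(2) and returns (read_fd, write_fd),
# i.e. filedes[0] and filedes[1] from the man page.
read_fd, write_fd = os.pipe()

os.write(write_fd, b"hello through the pipe")
os.close(write_fd)               # closing the write end lets the reader see EOF

received = os.read(read_fd, 1024)
at_eof = os.read(read_fd, 1024)  # empty bytes: every write end is closed
os.close(read_fd)

print(received, at_eof)
```

The second read returns an empty result only because the write end was closed first; with the write end still open it would block instead.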
I was really bad at googling the steps to grow ZFS rpool online without using zpool append, so here is my little story of how I did it.
Before I begin, please note that everything said below applies only to configurations where rpool consumes a whole disk. If there is another partition sitting right after your rpool (or after some gap), you risk corrupting the data. So don’t blindly use the last cylinder (or “$”) when it’s time to modify the partition table.
- Here is the rpool:
# zpool list rpool
NAME SIZE ALLOC FREE CAP HEALTH ALTROOT
rpool 31.1G 23.6G 7.52G 75% ONLINE -
- Save the partition table first so that it can be used later to correctly re-label (re-partition) the expanded disk. Btw, use “zpool status -v rpool” to find the device name:
# prtvtoc /dev/dsk/c4t60060E80167D3C0000017D3C000010CAd0s2
* /dev/dsk/c4t60060E80167D3C0000017D3C000010CAd0s2 partition map
* 512 bytes/sector
* 512 sectors/track
* 15 tracks/cylinder
* 7680 sectors/cylinder
* 8533 cylinders
* 8531 accessible cylinders
* 1: unmountable
* 10: read-only
* First Sector Last
* Partition Tag Flags Sector Count Sector Mount Directory
0 2 00 0 65518080 65518079
2 5 01 0 65518080 65518079
- Behind the scenes the LUN was expanded. To confirm that, luxadm can be quite handy (pay attention to the “Unformatted capacity” field):
# luxadm display /dev/rdsk/c4t60060E80167D3C0000017D3C000010CAd0s2
DEVICE PROPERTIES for disk: /dev/rdsk/c4t60060E80167D3C0000017D3C000010CAd0s2
Product ID: OPEN-V -SUN
Serial Num: 50 17D3C10CA
Unformatted capacity: 36864.000 MBytes
Read Cache: Enabled
Minimum prefetch: 0x0
Maximum prefetch: 0x0
Device Type: Disk device
- Time for the scariest part, i.e. re-labelling the disk by installing a new partition table.
To be able to do that, the disk’s new geometry must somehow be conveyed to the format utility, and that’s surprisingly easy to achieve. Just run format, select the disk and use the “type” option to autoconfigure it:
Searching for disks...done
AVAILABLE DISK SELECTIONS:
Specify disk (enter its number): 1
/dev/dsk/c4t60060E80167D3C0000017D3C000010CAd0s0 is part of active ZFS pool rpool. Please see zpool(1M).
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
AVAILABLE DRIVE TYPES:
0. Auto configure
1. Quantum ProDrive 80S
2. Quantum ProDrive 105S
3. CDC Wren IV 94171-344
16. Zip 100
17. Zip 250
18. Peerless 10GB
Specify disk type (enter its number): 0
c4t60060E80167D3C0000017D3C000010CAd0: configured with capacity of 35.99GB
- Notice that now it’s configured with a new capacity.
- Don’t leave the format prompt yet since we are not done. The next step is to carve out the partition table. Remember that I only had two partitions (0 for root and 2 for backup) and your situation might be completely different, so don’t copy/paste rashly.
- Use the numbers from the menu to select the partition you want to modify:
0 - change `0' partition
1 - change `1' partition
2 - change `2' partition
3 - change `3' partition
4 - change `4' partition
5 - change `5' partition
6 - change `6' partition
7 - change `7' partition
select - select a predefined table
modify - modify a predefined partition table
name - name the current table
print - display the current table
label - write partition map and label to the disk
- For me, it was enough to set the size of all partitions to zero except two: 0 (tagged as root) and 2 (tagged as backup). As you can see below, the wm flag (read-write, mountable) was set only for partition 0, whilst the rest have wu (read-write, unmountable):
Current partition table (unnamed):
Total disk cylinders available: 9828 + 2 (reserved cylinders)
Part Tag Flag Cylinders Size Blocks
0 root wm 0 - 9827 35.99GB (9828/0/0) 75479040
1 unassigned wu 0 0 (0/0/0) 0
2 backup wu 0 - 9827 35.99GB (9828/0/0) 75479040
3 unassigned wu 0 0 (0/0/0) 0
4 unassigned wu 0 0 (0/0/0) 0
5 unassigned wu 0 0 (0/0/0) 0
6 unassigned wu 0 0 (0/0/0) 0
7 unassigned wu 0 0 (0/0/0) 0
- Label the disk and quit the format tool:
Ready to label disk, continue? yes
- Finally it’s time to grow ZFS rpool and to confirm we’re golden:
# zpool online -e rpool /dev/dsk/c4t60060E80167D3C0000017D3C000010CAd0s0
# zpool list rpool
NAME SIZE ALLOC FREE CAP HEALTH ALTROOT
rpool 35.9G 23.6G 12.3G 65% ONLINE -
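As a sanity check, the block counts in both partition tables can be recomputed from the geometry that prtvtoc reported. The Python sketch below assumes the 512 bytes/sector and 7680 sectors/cylinder values stay the same after autoconfiguration (the new table is consistent with that); the function names are only illustrative:

```python
SECTOR_BYTES = 512            # "512 bytes/sector" from prtvtoc
SECTORS_PER_CYLINDER = 7680   # "7680 sectors/cylinder" from prtvtoc

def cylinders_to_sectors(cylinders: int) -> int:
    """Number of sectors covered by the given number of cylinders."""
    return cylinders * SECTORS_PER_CYLINDER

def sectors_to_gib(sectors: int) -> float:
    """Convert a sector count to GiB (2**30 bytes)."""
    return sectors * SECTOR_BYTES / 2**30

old_sectors = cylinders_to_sectors(8531)  # accessible cylinders before the resize
new_sectors = cylinders_to_sectors(9828)  # available cylinders after the resize

print(old_sectors, round(sectors_to_gib(old_sectors), 2))
print(new_sectors, round(sectors_to_gib(new_sectors), 2))
```

The two sector counts match the 65518080 and 75479040 figures shown by prtvtoc and format, which is a cheap way to confirm that the new slice 0 ends exactly at the last available cylinder.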
Good luck and stay safe expanding your pools.
Another astonishing appearance of Bryan Cantrill in a BSD Now episode.
A recent mysqldump that I ran on MySQL 5.5 ended with the following error:
Got error: 2006: MySQL server has gone away when selecting the database
It turned out that this error is quite common and there is a quick workaround: increase max_allowed_packet. In my case going from 16M to 128M was enough.
P.S. There is a helpful thread at DBA Stackexchange which is worth reading too.
I saw this error for the very first time a couple of days ago:
mysql> show table status;
ERROR 1143 (42000): SELECT command denied to user ''@'some_host_name_here' for column 'sid' in table 'masking'
What the heck was that?!
It turned out that a colleague of mine had been cleaning up the mysql.user table the day before and had deleted a number of stale users. So what? Well, the database I was working with didn’t have only “plain” tables:
mysql> show full tables in database_name_here where table_type not like '%table%';
| Tables_in_infra | Table_type |
| v_pool | VIEW |
| v_pool_pivot | VIEW |
2 rows in set (0.00 sec)
Aha moment! So we had two views. I checked whether they had a proper “DEFINER”:
mysql> select TABLE_NAME, DEFINER from information_schema.views;
| TABLE_NAME | DEFINER |
| v_pool | some_user@some_host |
| v_pool_pivot | some_user@some_host |
10 rows in set (0.00 sec)
Of course, that was exactly the user that had been deleted and, thankfully, that was very easy to fix.
I just had to run a “show create view” query to figure out how the view was created in the first place:
mysql> show create view v_pool_pivot\G
*************************** 1. row ***************************
Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`some_user`@`some_host` SQL SECURITY DEFINER VIEW `v_pool_pivot` AS (select followed by several lines of spaghetti SQL)
1 row in set, 1 warning (0.00 sec)
And after that just altered it:
mysql> alter DEFINER=CURRENT_USER view v_pool_pivot AS (select ...)
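To spot such views before they bite, the DEFINER values can be cross-checked against the accounts that still exist. The Python sketch below works on rows you would fetch yourself, for example with "select TABLE_NAME, DEFINER from information_schema.views" and "select User, Host from mysql.user". The function name orphaned_views and the sample rows are made up for illustration:

```python
def orphaned_views(views, accounts):
    """Return the names of views whose DEFINER account no longer exists.

    views    -- iterable of (table_name, definer) rows, definer as 'user@host'
    accounts -- iterable of (user, host) rows
    """
    known = {f"{user}@{host}" for user, host in accounts}
    return [name for name, definer in views if definer not in known]

# Made-up rows mirroring the placeholders used in this post.
views = [("v_pool", "some_user@some_host"),
         ("v_pool_pivot", "some_user@some_host")]
accounts = [("root", "localhost")]

print(orphaned_views(views, accounts))
```

Every view it reports is a candidate for the same "alter DEFINER=... view" treatment shown above.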
The latest episode of the BSD Now podcast (103) brought a fantastic and hilarious interview with Bryan Cantrill, who is well known for his wit and right-on-the-bullseye rants. It’s been a while since I cried laughing, so this video is unquestionably highly recommended. Not to mention that his talk was very educational from both the technical (epoll, kqueue) and historical points of view. Bookmarked and added to the favorites.
This is a short write-up for the case when, after a reset or a reboot, your FreeBSD (or Linux) instance doesn’t come online, stalls, and “aws ec2 get-console-output” returns something like this among its lines:
UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
- Just power off the faulty instance, either from the web interface or using the CLI:
aws ec2 stop-instances --instance-ids your_instance_id
- Again, using the web interface or the CLI (if you know its id), detach the volume the root filesystem lives on:
aws ec2 detach-volume --volume-id faulty_volume_id --instance-id your_instance_id
- Create a new minimal instance and attach the volume that was detached during the previous step.
- Boot it up and simply run fsck manually as advised.
On FreeBSD you will have to add an entry to /etc/fstab, otherwise fsck will complain:
# fsck -y /dev/xbd5a
fsck: Could not determine filesystem type
In my case I just added a single line:
echo "/dev/xbd5a /mnt ufs rw 1 1" >> /etc/fstab
- After that, just do the reverse: detach the volume, attach it back to your main instance and power it up.
Hope everything is golden at this point.
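For reference, the whole round trip can be written down as an ordered list of aws CLI calls. The Python sketch below only builds and prints the commands, it does not run them. The rescue instance id and both device names (/dev/sdf and /dev/sda1) are assumptions for illustration, so check which device your root volume was originally attached as before re-attaching it:

```python
import shlex

def rescue_commands(broken_id, rescue_id, volume_id, rescue_device, root_device):
    """Return the aws CLI calls for the detach/fsck/re-attach round trip, in order."""
    steps = [
        ["aws", "ec2", "stop-instances", "--instance-ids", broken_id],
        ["aws", "ec2", "detach-volume", "--volume-id", volume_id,
         "--instance-id", broken_id],
        ["aws", "ec2", "attach-volume", "--volume-id", volume_id,
         "--instance-id", rescue_id, "--device", rescue_device],
        # fsck is run by hand on the rescue instance between these two steps.
        ["aws", "ec2", "detach-volume", "--volume-id", volume_id,
         "--instance-id", rescue_id],
        ["aws", "ec2", "attach-volume", "--volume-id", volume_id,
         "--instance-id", broken_id, "--device", root_device],
        ["aws", "ec2", "start-instances", "--instance-ids", broken_id],
    ]
    return [shlex.join(step) for step in steps]

# All ids and device names below are placeholders.
commands = rescue_commands("your_instance_id", "rescue_instance_id",
                           "faulty_volume_id", "/dev/sdf", "/dev/sda1")
for command in commands:
    print(command)
```

Printing instead of executing keeps the sketch harmless; each line can be reviewed and pasted into a shell one step at a time.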
In the era of VSP G1000 this post may sound dated, but I still hope that it will help some poor soul in the same situation I was in some time ago. The task was laughably simple: power off a USP-V, power it up just to make sure it could still boot flawlessly, and shut it down again. All went nice and dandy till the final step, the ultimate power off. So I opened the DKC panel and simultaneously flipped two switches: PS-ENABLE and PS-OFF (all in accordance with the maintenance manual).
So far so good. During the first power-off iteration it took the array (DKC and one DKU) roughly 15-20 minutes to cut the power to its components so that the switches on the AC boxes could be turned to the OFF position. But not this time… After waiting for an hour and a half the system was still up. Thankfully, however, the disks had been spun down successfully. That opened the door to a forced shutdown procedure, which is as simple and straightforward as a samurai’s sword.
- Open the DKC panel and unscrew it as shown in the picture below.
- Pay attention to the jumpers. We will be using JP3, which is right above JP2.
- If you don’t have a jumper (I didn’t have one) there is a workaround. Go to the back of a DKU, open its door, and at the bottom there is a recess with another set of jumpers.
- It is safe to pick and pull out any of these jumpers (remember that the disks have to be powered off before that).
I’ve been told that these jumpers define the physical position of a DKU rack relative to the DKC (whether it is on the left or the right and how far away: DKU-R1, DKU-L1, etc.).
- Once you have a jumper, put it into JP3 in the DKC panel and turn the CHK RST switch on by pressing on its upper half.
- A moment after that the array will shut down.