No video during the flight

I don’t know which version of Linux they were running, but it looks like one of the following code paths triggered the issue:

static int pca953x_read_regs(struct pca953x_chip *chip, int reg, u8 *val)
{
	int ret;

	ret = chip->read_regs(chip, reg, val);
	if (ret < 0) {
		dev_err(&chip->client->dev, "failed reading register\n");
		return ret;
	}

	return 0;
}

static int pca953x_read_single(struct pca953x_chip *chip, int reg, u32 *val, int off)
{
	int ret;
	int bank_shift = fls((chip->gpio_chip.ngpio - 1) / BANK_SZ);
	int offset = off / BANK_SZ;

	ret = i2c_smbus_read_byte_data(chip->client,
				(reg << bank_shift) + offset);
	*val = ret;

	if (ret < 0) {
		dev_err(&chip->client->dev, "failed reading register\n");
		return ret;
	}

	return 0;
}
Posted on November 13, 2017 at 10:09 am by sergeyt
In: Life, Linux

Things to keep in mind about HTTP/2

This talk is just unbelievably helpful.

Posted on September 21, 2017 at 9:52 am by sergeyt
In: Uncategorized

MongoDB 3.4 or stay on 3.2?

If you’re herding multiple shards, this alone should be convincing enough to jump on the 3.4 bandwagon:

mongos> sh.getBalancerHost()
getBalancerHost is deprecated starting version 3.4. 
The balancer is running on the config server primary host.
Posted on September 11, 2017 at 12:39 pm by sergeyt
In: MongoDB

Moving to OmniOS Community Edition

I hit a small snag when I tried to upgrade my old (r151018) OmniOS installation to OmniOS CE as described in the ANNOUNCEMENT OmniOS Community Edition – OmniOSce r151022h.

During the “pkg update” stage I got something similar to the following:

pkg update: The certificate which issued this certificate:/C=US/ST=Maryland/O=OmniTI/OU=OmniOS/CN=OmniOS r151018 Release Signing Certificate/emailAddress=omnios-supp…@omniti.com could not be found.

Thankfully, the solution was a straightforward sequence of upgrades: first to r151020, then to r151021 and finally to r151022.
From there I was able to successfully upgrade to OmniOS CE. Even the “-r” option in “pkg update -rv”, which doesn’t exist in r151018, worked like a charm at that point. I could probably have skipped r151021 altogether, but it’s always better to be safe than sorry.

Posted on August 14, 2017 at 11:22 am by sergeyt
In: Solaris

How to reuse dropped sharded collection’s name

Sometimes you want to drop a sharded collection and then reuse its name. However, it might not be as straightforward as one expects:

mongos> sh.shardCollection("your_database.your_collection", { "sharded_key": 1})

"code" : 13449,
"ok" : 0,
"errmsg" : "exception: collection your_database.your_collection already sharded"

The error message might be different but you get the idea – you can’t shard a collection if its name matches one that has been recently dropped. Thankfully, there is a workaround described in SERVER-17397:

When dropping a collection:
use config
db.collections.remove( { _id: "DATABASE.COLLECTION" } )
db.chunks.remove( { ns: "DATABASE.COLLECTION" } )
db.locks.remove( { _id: "DATABASE.COLLECTION" } )
Connect to each mongos and run flushRouterConfig
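
Since the same namespace has to be typed three times, a tiny helper that renders the shell commands for a given database and collection can reduce the chance of a typo. This is only a sketch: the function name `cleanup_commands` is made up for illustration, it merely prints text to paste into the mongo shell, and the `db.adminCommand` form of flushRouterConfig is my assumption about how you would run it on each mongos.

```python
import json


def cleanup_commands(database, collection):
    """Render the config-database cleanup commands for one namespace.

    Hypothetical helper: it only builds strings, it does not talk to MongoDB.
    """
    ns = json.dumps(f"{database}.{collection}")  # quoted "db.collection"
    return [
        "use config",
        f"db.collections.remove( {{ _id: {ns} }} )",
        f"db.chunks.remove( {{ ns: {ns} }} )",
        f"db.locks.remove( {{ _id: {ns} }} )",
    ]


# Assumed shell form of the last step; run it on every mongos.
FLUSH_COMMAND = "db.adminCommand( { flushRouterConfig: 1 } )"

for line in cleanup_commands("your_database", "your_collection"):
    print(line)
print(FLUSH_COMMAND)
```

Review the printed commands before pasting them anywhere near production.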

Followed the steps in prod yesterday and it worked like a charm.

Posted on July 19, 2017 at 12:35 pm by sergeyt
In: MongoDB

TIL MongoDB Index Build could exceed 100%

A quote from SERVER-7631:

Since data can be inserted while its running, this can go over 100 by design.
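
A toy calculation of how that can happen, assuming the percentage is computed against a document count captured when the build starts (my assumption for illustration, not something taken from the MongoDB source):

```python
def index_build_progress(total_at_start, processed):
    """Progress in percent, relative to the count taken when the build began."""
    return 100.0 * processed / total_at_start


# 1000 documents when the build starts, 50 more inserted while it runs.
total_at_start = 1000
inserted_during_build = 50
print(index_build_progress(total_at_start, total_at_start + inserted_during_build))  # 105.0
```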
Posted on July 4, 2017 at 2:23 pm by sergeyt
In: MongoDB

TIL Remove a Znode from Zookeeper

Yep, you can easily do that (and much more) using zkCli.sh (the ZooKeeper client):

$ /usr/share/zookeeper/bin/zkCli.sh 
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null

[zk: localhost:2181(CONNECTED) 0] help
ZooKeeper -server host:port cmd args
	connect host:port
	get path [watch]
	ls path [watch]
	set path data [version]
	rmr path
	delquota [-n|-b] path
	quit 
	printwatches on|off
	create [-s] [-e] path data acl
	stat path [watch]
	close 
	ls2 path [watch]
	history 
	listquota path
	setAcl path acl
	getAcl path
	sync path
	redo cmdno
	addauth scheme auth
	delete path [version]
	setquota -n|-b val path

Issue “rmr” (to remove recursively) or “delete” to remove a znode.
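
The difference between the two is that “delete” refuses to remove a znode that still has children, while “rmr” takes the whole subtree. Here is a toy in-memory model of that behaviour. It is not a ZooKeeper client: the tree is just a set of path strings, and the names `NotEmptyError`, `delete` and `rmr` are only borrowed for illustration.

```python
class NotEmptyError(Exception):
    """Raised when a plain delete hits a znode that still has children."""


def _children_prefix(path):
    return path.rstrip("/") + "/"


def delete(tree, path):
    """Remove a single znode; refuse if anything lives below it."""
    prefix = _children_prefix(path)
    if any(p.startswith(prefix) for p in tree):
        raise NotEmptyError(path)
    tree.discard(path)  # toy model: a missing path is silently ignored


def rmr(tree, path):
    """Remove a znode together with everything below it."""
    prefix = _children_prefix(path)
    for p in [p for p in tree if p == path or p.startswith(prefix)]:
        tree.discard(p)


tree = {"/app", "/app/a", "/app/a/b", "/other"}
rmr(tree, "/app/a")   # removes /app/a and /app/a/b
delete(tree, "/app")  # fine now, /app has no children left
print(sorted(tree))   # ['/other']
```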

Posted on May 30, 2017 at 12:16 pm by sergeyt
In: TIL

TIL HSTS requires a secure transport

Otherwise (quoting RFC6797):

If an HTTP response is received over insecure transport, the UA MUST ignore any present STS header field(s).

In practice that also means the SSL certificate on your server must be valid, i.e. there must be no errors or warnings when you open a page in a browser over https.
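
Roughly, the decision a user agent makes can be sketched like this. It is a simplified model, not browser code: the function name and its arguments are hypothetical, and the header value is not parsed at all. The second check reflects the RFC's requirement that there be no underlying secure transport errors or warnings.

```python
def should_honor_sts(scheme, tls_errors, headers):
    """Decide whether a Strict-Transport-Security header may be acted upon.

    scheme      -- "http" or "https" of the response being processed
    tls_errors  -- list of certificate/TLS problems seen on this connection
    headers     -- mapping of response header names to values
    """
    if scheme != "https":
        return False  # insecure transport: the header must be ignored
    if tls_errors:
        return False  # certificate errors or warnings: do not note the host
    return any(name.lower() == "strict-transport-security" for name in headers)


sts = {"Strict-Transport-Security": "max-age=31536000"}
print(should_honor_sts("https", [], sts))                       # True
print(should_honor_sts("http", [], sts))                        # False
print(should_honor_sts("https", ["certificate expired"], sts))  # False
```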

Posted on May 25, 2017 at 2:25 pm by sergeyt
In: TIL

Restart your Mongos after maxConsecutiveFailedChecks

Take it literally.
If you configured your MongoDB config servers as a replica set and, for some reason (say, a network outage), a Mongos server loses its connection to all of them and cannot reconnect within maxConsecutiveFailedChecks attempts then, surprise, it becomes useless. Even once the network is up and running again, Mongos will not reconnect to the config servers, and you won’t be able to authenticate to your sharded cluster until Mongos is restarted.

From https://api.mongodb.com/cplusplus/current/classmongo_1_1_replica_set_monitor.html

static int 	maxConsecutiveFailedChecks = 30
 	If a ReplicaSetMonitor has been refreshed more than this many times in a row without finding any live nodes claiming to be in the set, the ReplicaSetMonitorWatcher will stop periodic background refreshes of this set. 

And if you check the source code of the 3.2.x branch (3.2.12 as of this writing), you will see the following (./src/mongo/client/replica_set_monitor.cpp):

if (_scan->foundAnyUpNodes) {
    _set->consecutiveFailedScans = 0;
} else {
    _set->consecutiveFailedScans++;
    if (timeOutMonitoringReplicaSets) {
        warning() << "All nodes for set " << _set->name << " are down. "
                  << "This has happened for " << _set->consecutiveFailedScans
                  << " checks in a row. Polling will stop after "
                  << maxConsecutiveFailedChecks - _set->consecutiveFailedScans
                  << " more failed checks";
    }
}

So once you go past maxConsecutiveFailedChecks the replica set will become unusable:

bool SetState::isUsable() const {
    return consecutiveFailedScans < maxConsecutiveFailedChecks;
}
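
To see why the set never recovers, here is a small model of the counter. It mirrors the two snippets above, but the loop that stops refreshing once the set is unusable is my reading of the documentation quote, not a port of the actual watcher code, and all the Python names are made up.

```python
MAX_CONSECUTIVE_FAILED_CHECKS = 30


class SetState:
    def __init__(self):
        self.consecutive_failed_scans = 0

    def record_scan(self, found_any_up_nodes):
        # Same bookkeeping as in replica_set_monitor.cpp above.
        if found_any_up_nodes:
            self.consecutive_failed_scans = 0
        else:
            self.consecutive_failed_scans += 1

    def is_usable(self):
        return self.consecutive_failed_scans < MAX_CONSECUTIVE_FAILED_CHECKS


def run_checks(state, scan_results):
    """Feed scan results until the set becomes unusable; return scans performed."""
    performed = 0
    for found_any_up_nodes in scan_results:
        if not state.is_usable():
            break  # assumption: background refreshes have stopped for good
        state.record_scan(found_any_up_nodes)
        performed += 1
    return performed


# An outage lasting 30 checks, after which the network is healthy again.
state = SetState()
performed = run_checks(state, [False] * 30 + [True] * 5)
print(performed, state.is_usable())  # 30 False
```

The five healthy scans at the end are never performed, so the counter is never reset; that is the situation in which only a restart helps.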

As far as I can tell, 3.4.x doesn't have maxConsecutiveFailedChecks, so hopefully one will not have to intervene and restart Mongos manually.

Posted on February 9, 2017 at 5:03 pm by sergeyt
In: MongoDB

Watch “Monitorama 2016: All of Your Networking Monitoring is (probably) wrong” talk

I just came across this talk mentioned in the comments on Hacker News and, boy, it’s absolutely amazing!
Watch this hilarious talk here – Monitorama 2016: All of Your Networking Monitoring is (probably) wrong

Btw, the talk is presented, presumably, by the same guy who wrote Monitoring and Tuning the Linux Networking Stack: Receiving Data and Monitoring and Tuning the Linux Networking Stack: Sending Data, both of which are must-reads.

Posted on February 8, 2017 at 11:55 am by sergeyt
In: Linux