王福强(Darren.Wang)
Table of Contents
- 1. What? What's HandlerSocket?
- 2. Things You should know
- 2.1. Binlog is still available with HandlerSocket
- 2.2. What if I only want to update specific columns of a table instead of all?
- 2.3. pst_id is confined to each connection; it maps to prep_stat in the HS codebase
- 2.4. QueryCache invalidation issue
- 2.5. Encoded string issue
- 2.6. Play around with HS via telnet
- 2.7. The insert columns have nothing to do with the open_index columns
- 2.8. auto_increment is indeed a problem with HS
- 2.9. HandlerSocket doesn't support transactions
- 2.10. Why is HandlerSocket fast?
- 3. Last Words On This
A Few Things You Have to Know About the MySQL HandlerSocket Plugin
As you may know, MySQL 5.1 introduced a plugin mechanism into the server. With plugins, we can extend or customize the functionality and behavior of the MySQL server. HandlerSocket is one such plugin: it lets you access the underlying storage engines of the MySQL server (currently, only InnoDB is supported) without the overhead of SQL parsing, execution planning, and so on. As the author of HandlerSocket states, you can achieve 750,000 qps with it. Sounds fantastic, huh? For more details on HandlerSocket, refer to the original blog about HS. Of course, you can also refer to the official HS site on GitHub, where you can find almost everything and keep up with the current development of HS.
Oh, by the way, if you want to get started with HS and would like to read something useful about it in Chinese, read this, hehe; an old friend of mine wrote it.
OK, now let's get to today's topic.
I am sorry that I once said binlog is not available when we use HandlerSocket for data access; that's not true. In fact, HandlerSocket is implemented as a Handler in MySQL, which means a callback for binlog writing has to be implemented, so HandlerSocket still writes binlogs during data access. In short, HandlerSocket writes the binlog in row-based format.
When you only want to update a few columns, open an index with only those columns. This is a trick, but you need to know it.
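As a rough sketch of this trick, the snippet below builds the two request lines involved: one that opens an index exposing only the column to be updated, and one find-and-modify request using the `U` operation. The database, table, and column names (`test`, `dw`, `value`) and the helper names are illustrative assumptions, and the values are assumed to contain no bytes that need escaping.

```python
TAB = "\t"  # HS separates tokens with 0x09 and ends each request with 0x0a


def open_index_request(index_id, db, table, index, columns):
    """Build an open_index line: P <id> <db> <table> <index> <col1,col2,...>."""
    fields = ["P", str(index_id), db, table, index, ",".join(columns)]
    return TAB.join(fields) + "\n"


def update_request(index_id, key, new_values, limit=1, offset=0):
    """Build a find-and-modify line that updates the row whose key equals `key`."""
    fields = [str(index_id), "=", "1", str(key), str(limit), str(offset), "U"]
    fields += [str(v) for v in new_values]
    return TAB.join(fields) + "\n"


# Only `value` is listed when opening the index, so only `value` gets updated.
open_line = open_index_request(1, "test", "dw", "PRIMARY", ["value"])
update_line = update_request(1, 1, ["new value"])
print(repr(open_line))
print(repr(update_line))
```

The number of values after `U` has to match the number of columns listed when the index was opened, which is why the column list is kept as short as the update needs.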
pst_id is confined to each connection (it maps to prep_stat in the HS codebase), so as long as you keep each pst_id consistent within one connection, the opened indexes will not clash. Most of the time, the client should take care of pst_id/indexId generation and management.
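One way a client might manage this is a small per-connection registry that hands out index ids and reuses the id of an index that is already open. This is only a sketch: the class and method names are hypothetical and do not come from any existing HS client.

```python
class IndexIdRegistry:
    """Hands out per-connection index ids (pst_id) and reuses ids of known indexes."""

    def __init__(self):
        self._ids = {}

    def id_for(self, db, table, index, columns):
        """Return (index_id, is_new); is_new means open_index still has to be sent."""
        key = (db, table, index, tuple(columns))
        if key in self._ids:
            return self._ids[key], False
        index_id = len(self._ids)  # next unused id on this connection
        self._ids[key] = index_id
        return index_id, True


registry = IndexIdRegistry()  # create one registry per connection
print(registry.id_for("test", "dw", "PRIMARY", ["id", "value"]))  # (0, True)
print(registry.id_for("test", "dw", "PRIMARY", ["value"]))        # (1, True)
print(registry.id_for("test", "dw", "PRIMARY", ["id", "value"]))  # (0, False)
```

Because the ids are only meaningful inside one connection, a connection pool would need one such registry per pooled connection rather than a shared one.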
Currently, HandlerSocket does not invalidate the query cache when updating the database, but the author told me in the GitHub forum that he would like to add this feature in the near future, so maybe it will come soon.
(Update: the new version of HS has fixed this issue)
When encoding a string as per the HS protocol, you have to convert the string to an array of bytes, but the protocol doesn't say which charset to use for the conversion. I asked this question in the GitHub forum, and the author answered that we should use the encoding of the target table.
There is another point about string encoding I should mention. As the protocol states, the client should encode column values before sending them to the HS server, because some bytes are reserved (e.g. 0x00 for NULL, 0x09 as the delimiter, etc.). So the protocol requires special treatment for any byte value between 0x00 and 0x0f: such a byte is encoded by adding 0x40 to it and prefixing the result with 0x01. But as far as I know, most of the Java clients don't pay attention to this encoding issue (most of the time I live in the Java world, although recently I have started to join Scala's), so take care on this point when you pick an open source Java client for HandlerSocket.
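The two encoding steps can be sketched as follows: convert the string to bytes with the table's charset, then escape every byte in the range 0x00 to 0x0f as 0x01 followed by the byte plus 0x40. The function names are my own, and the default charset is an assumption; pass whatever Python codec name corresponds to the target table's encoding.

```python
def encode_token(value, charset="utf-8"):
    """Encode one column value for the wire; None becomes the single NULL byte."""
    if value is None:
        return b"\x00"
    out = bytearray()
    for b in value.encode(charset):
        if b <= 0x0F:
            out.append(0x01)      # escape prefix
            out.append(b + 0x40)  # shifted byte, e.g. 0x09 -> 0x49
        else:
            out.append(b)
    return bytes(out)


def decode_token(token, charset="utf-8"):
    """Reverse encode_token; a lone 0x00 byte decodes to None."""
    if token == b"\x00":
        return None
    out = bytearray()
    it = iter(token)
    for b in it:
        if b == 0x01:
            nxt = next(it, None)
            if nxt is None:
                raise ValueError("dangling escape prefix at end of token")
            out.append(nxt - 0x40)
        else:
            out.append(b)
    return out.decode(charset)


print(encode_token("a\tb"))                 # the tab is escaped
print(encode_token(None))                   # NULL marker
print(decode_token(encode_token("a\tb")))   # round trip
```

Note that a string consisting of a single NUL character encodes to two bytes (0x01 0x40), so it stays distinguishable from a NULL value.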
The HandlerSocket protocol is a small, text-based protocol. As with the memcached text protocol, you can use telnet to fetch rows through HandlerSocket.
So let's have some fun.
First, let's create a table to play with:
create table dw( id int(10) unsigned auto_increment not null, value varchar(25), primary key(id)) engine=InnoDB;
Then, here is what we may do next:
Example 1.
fujohnwangs-MacBook-Pro:~ fujohnwang$ telnet 10.16.201.39 9999
Trying 10.16.201.39...
Connected to 10.16.201.39.
Escape character is '^]'.
P 0 test dw PRIMARY value
0 1
0 + 2 1 darren
0 1
0 = 1 0 1 0 D
0 1 1
Just a hint: the delimiter between the values above is a tab (0x09), not a space, just as the protocol states. The first time around, you may overlook this and fail to interact with HS successfully.
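To make the tab delimiters visible, here is a small sketch that builds the request lines of Example 1 programmatically. It assumes the session consists of three requests (open_index, insert, and find-and-delete), which is my reading of the example; `request_line` is a hypothetical helper, and the tokens are assumed to need no escaping.

```python
def request_line(*tokens):
    """Join tokens with TAB (0x09) and terminate the request with LF (0x0a)."""
    return "\t".join(str(t) for t in tokens) + "\n"


session = [
    request_line("P", 0, "test", "dw", "PRIMARY", "value"),  # open index 0
    request_line(0, "+", 2, 1, "darren"),                    # insert
    request_line(0, "=", 1, 0, 1, 0, "D"),                   # find and delete
]

for line in session:
    print(repr(line))  # repr() shows every \t and the trailing \n
```

Writing these strings to a socket connected to the HS port should have the same effect as typing them into telnet with real tab characters.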
Let's say we have a table with 2 columns (id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, value VARCHAR(25)). Since we define id as auto_increment, we may only care about the value column when inserting records into this table, so we may open an index with the value column only and insert just a string value:
What? The error code is 1, not 0, which means something went wrong. Yes: unlike update operations, the insert operation ALWAYS fills columns from the first column of the table to the last, even if we only opened the index with the value column. In the sample above, the operation in fact tries to insert stringvalue as an int value, which indeed causes an error.
Note
In fact, sometimes stringvalue will be converted to 0 instead of raising an error. But the gotcha with insert is the key point here.
Why is auto_increment a problem with HS? See the item above. ^_^ Of course, most of the time it's not a big deal in HS's typical usage scenarios; in distributed environments, auto_increment is not preferred anyway.
(Update: auto_increment works now)
HandlerSocket doesn't support transactions, but the data changes are durable. Use HandlerSocket properly and wisely, in suitable scenarios.
Why is HandlerSocket fast? The original blog explains this; here is just a summary:
- much lower CPU usage (analysis with oprofile)
- multiple requests are executed in bulk on the server side, which lowers CPU and disk usage
- a custom, efficient text-based communication protocol, at least more efficient than MySQL's own