In this blog, I am going to share the MySQL 8 InnoDB architecture diagram with its related variables. I prepared the architecture based on the InnoDB variables documented for MySQL 8.0.20 in the official MySQL documentation. I am also providing the list of variables used in this diagram, along with their respective versions.
MySQL 8 InnoDB Architecture:
To understand the diagram better, I have added some older variables as well (sync_binlog / innodb_log_buffer_size / innodb_flush_log_at_trx_commit / innodb_flush_method, etc.).
I have marked the MySQL 8 variables in yellow.
A few months back, I released a MySQL 8 architecture diagram, but the image quality was very low, so some people personally asked me to send the original image through email. Unfortunately, I no longer have it. I had been planning for a long time to draw the diagram again and release it on my website, and it has happened now 🙂 . The current diagram has some modifications as well. Thanks to everyone who encouraged me to draw this diagram again. I will come up with my next blog as soon as possible.
MySQL has an inbuilt query rewrite plugin that rewrites SQL statements received by the server before executing them.
Before MySQL 8.0.12, the query rewrite plugin only supported SELECT statements.
From MySQL 8.0.12, the query rewrite plugin supports INSERT, REPLACE, UPDATE, DELETE, and SELECT statements.
One of our customers had a performance issue with DELETE statements on a particular table. The issue happened after a new code deployment. In this blog, I am going to explain the implementation of the MySQL query rewrite plugin and how we overcame the issue using it.
MySQL Query Rewrite Plugin Implementation:
MySQL provides two SQL files for installation / uninstallation purposes. The files are located under the directory defined by "lc_messages_dir".
The file install_rewriter.sql is for installation.
The file uninstall_rewriter.sql is for uninstallation.
Once you load the file "install_rewriter.sql", it creates a separate database / table / procedure for managing the query rewrite rules.
database : query_rewrite
table : rewrite_rules
procedure : flush_rewrite_rules
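As a quick sketch, the installation file can be loaded from the MySQL client like this (the path below is an assumption; check your own lc_messages_dir value for the actual location):

```sql
-- Load the installer (path is an assumption; see SHOW VARIABLES LIKE 'lc_messages_dir')
SOURCE /usr/share/mysql-8.0/install_rewriter.sql;

-- Verify that the plugin is enabled
SHOW GLOBAL VARIABLES LIKE 'rewriter_enabled';
```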
Table structure:
The SQL rewrite rules need to be configured in the table "rewrite_rules".
Based on the configured rules, the server rewrites the queries.
All the columns have their own purpose.
Here is a simple example with the query rewrite plugin.
pattern : SELECT ?
replacement : SELECT ? + 1
Hope this makes sense!!
Note: You always need to call the procedure "flush_rewrite_rules" whenever you modify the rules table.
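For instance, the SELECT ? example above can be configured like this (this mirrors the example from the MySQL manual):

```sql
INSERT INTO query_rewrite.rewrite_rules (pattern, replacement)
VALUES ('SELECT ?', 'SELECT ? + 1');

-- Load the modified rules into the plugin's in-memory cache
CALL query_rewrite.flush_rewrite_rules();

-- A matching statement such as SELECT 10 is now rewritten to SELECT 10 + 1
```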
My problem statement with DELETE:
Last Friday, one of our clients had a new code deployment.
With the new code, they mistakenly applied the UPPER function to one of the DELETE statements, where it was actually not required.
So, the query was not able to use the index, as the function prevents index usage.
The table had around 30M records.
Unfortunately, modifying the code to remove the UPPER function was not possible at that time.
Fix: We provided the fix using the MySQL query rewrite plugin, by adding a rule without the UPPER function.
Sharing production data is restricted, so I have reproduced the same issue in my local environment with less data.
Experiment in my local environment:
Created the table "jc_rewritten" and inserted some records.
DELETE with the UPPER function:
Here you can see the UPPER function is blocking the index on the column "f_id", and the query is doing an FTS (Full Table Scan). That is very painful on a 30M-row table.
DELETE without the UPPER function:
Without the UPPER function, the query uses the optimal index and does a direct lookup. It is very fast.
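The original screenshots are not reproduced here, but a minimal sketch of the experiment looks like this (the table definition and values are assumptions; only the table and column names come from the original test):

```sql
CREATE TABLE jc_rewritten (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  f_id VARCHAR(32),
  KEY idx_f_id (f_id)
);

-- With the function, the optimizer cannot use idx_f_id
-- (EXPLAIN typically reports type: ALL, i.e. a full table scan)
EXPLAIN DELETE FROM jc_rewritten WHERE UPPER(f_id) = 'ABC123';

-- Without the function, the index is usable and the lookup is direct
-- (EXPLAIN typically reports type: range on idx_f_id)
EXPLAIN DELETE FROM jc_rewritten WHERE f_id = 'ABC123';
```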
Both queries affect the same data.
So, we made the query rewrite rule like the one below.
We are just removing the UPPER function, as it is not required.
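A sketch of such a rule, based on the simplified local table (the real production pattern differs):

```sql
-- Rewrite the function-wrapped DELETE into the index-friendly form
INSERT INTO query_rewrite.rewrite_rules (pattern, replacement)
VALUES ('DELETE FROM jc_rewritten WHERE UPPER(f_id) = ?',
        'DELETE FROM jc_rewritten WHERE f_id = ?');

CALL query_rewrite.flush_rewrite_rules();
```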
After activating the query rule, the SQL ran fine as expected, and our performance issue was fixed.
You can see the warning message showing how the server changed the query. Below are the entries from the general query log.
Some database tools (e.g. ProxySQL) have more features around query rewriting. MySQL's inbuilt query rewrite plugin recently (from MySQL 8.0.12) started supporting DELETEs, UPDATEs, and INSERTs, and it helped a lot in solving our production issue. I am expecting more features in upcoming versions. Special thanks to the MySQL community team!! I will come up with my next blog as soon as possible.
MySQL has a nice feature that helps write the error log in JSON format. This can be achieved through MySQL components. In this blog, I am going to show the complete walkthrough of this process.
What is a MySQL component?
From the MySQL documentation:
MySQL Server includes a component-based infrastructure for extending server capabilities. A component provides services that are available to the server and other components. (With respect to service use, the server is a component, equal to other components.) Components interact with each other only through the services they provide.
As the first step, we need to install the component "component_log_sink_json".
install component 'file://component_log_sink_json';
select * from mysql.component where component_urn like '%json%'\G
There is a table, "mysql.component", which helps to obtain the status of the installed MySQL components.
Step 2 (Enabling the JSON service):
There are a few services that help to shape the error log format. Each service is either a filter or a sink. The variable "log_error_services" is used to enable the required services.
Available services :
log_filter_internal ( filter )
log_filter_dragnet ( filter )
log_sink_internal ( sink )
log_sink_json ( sink )
log_sink_syseventlog ( sink )
log_sink_test ( sink )
For our purpose, I am going to enable the services "log_filter_internal" and "log_sink_json".
log_filter_internal: Implements filtering based on log event priority and error code, in combination with the log_error_verbosity and log_error_suppression_list system variables.
log_sink_json: Implements JSON-format logging in the error log.
set global log_error_services = 'log_filter_internal; log_sink_json';
From the above screenshot, you can see that a new file (mysql.err.00.json) was created when I enabled the JSON service.
I manually created an error log event for testing purposes (by executing a wrong command).
Yes, it is working well !!
Make sure to add the variable (log_error_services) to the MySQL configuration file (my.cnf) as well. You can also use SET PERSIST.
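For example, SET PERSIST writes the setting to mysqld-auto.cnf, so it survives a restart without editing my.cnf:

```sql
SET PERSIST log_error_services = 'log_filter_internal; log_sink_json';
```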
Note: The log directory should have the proper permissions.
How to uninstall the JSON component?
You cannot simply uninstall the component using the UNINSTALL COMPONENT command. You need to first remove the JSON service from the variable "log_error_services"; only then will mysqld allow you to perform the UNINSTALL COMPONENT.
set global log_error_services = 'log_filter_internal; log_sink_internal';
uninstall component 'file://component_log_sink_json';
The MySQL error log will not contain any additional information in JSON format, but you will have the name of each field (time: xxxxxx, err_code: xxxx), and the view is better. Hope this blog helps someone who loves JSON. I will come up with my next blog as soon as possible.
We know well that the MySQL Group Replication cluster functionally works based on GTIDs. GTID is mandatory for Group Replication. Usually, a GTID is the combination of a source_id (or server_uuid) and a transaction_id. Do you think the same approach is followed in MySQL Group Replication as well? Well, this blog mainly focuses on the three topics below.
How are the binary log GTID events generated in a normal MySQL instance?
How are the binary log GTID events generated in a MySQL Group Replication cluster?
How to configure asynchronous replication from a Group Replication cluster without a single point of failure?
How are the binary log GTID events generated in a normal MySQL instance?
As I said, the Global Transaction Identifier (GTID) is the combination of server_uuid and transaction_id: [server_uuid]:[transaction_id]. As an example, I have locally installed a MySQL instance with GTID enabled.
From the above screenshot, we can see that the binary log GTID events are represented as the combination of server_uuid and transaction_id.
How are the binary log GTID events generated in a MySQL Group Replication cluster?
Now, let's take a look at the MySQL Group Replication servers. I have locally configured a three-node setup.
In MySQL Group Replication, the GTID is not the combination of server_uuid and transaction_id. Instead, it is the combination of group_replication_group_name and transaction_id.
All the servers belonging to a particular group should have the same group name. The value of group_replication_group_name must be a valid UUID. It can be generated with the command uuidgen or with SELECT UUID();.
Let's see the example:
at node1,
at node3,
From the above screenshots, we can conclude the following:
All three servers in the same group can have different server UUIDs (server_uuid).
All three servers in the same group should have the same group name (group_replication_group_name).
All three servers in the same group have GTID events in the form [group_replication_group_name]:[transaction_id].
Clear, right?
How to configure asynchronous replication from a Group Replication cluster without a single point of failure?
The requirement is:
I have a three-node cluster: sakthi1, sakthi2, sakthi3. I want to configure an asynchronous slave (D1) for DR / reporting purposes.
Assume D1 (slave) is configured as a slave under the node sakthi2 (master). If the master node sakthi2 fails and is removed from the cluster, it should not affect D1's replication. The server D1 still needs to receive the live updates.
D1 should not have a single point of failure until the complete cluster fails.
Is that possible? Yes.
We can configure MULTI-SOURCE replication with all three cluster nodes (sakthi1, sakthi2, sakthi3) as masters. That way, if one node fails, the data will keep being applied from the other nodes.
Will replicating the same data from three different nodes cause any replication errors (DUPLICATE ENTRY)?
No. Because the replication happens based on GTIDs, and the GTIDs are the same on all nodes in the cluster, the slave SQL thread will not execute already-executed GTID events again.
As per the plan, I have created the replication with 3 channels (dr1, dr2, dr3).
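The channel setup can be sketched like this, run on D1 (the hostnames, port, and replication user are assumptions):

```sql
-- One channel per cluster node; GTID auto-positioning lets the SQL thread
-- skip events that were already applied through another channel
CHANGE MASTER TO MASTER_HOST='sakthi1', MASTER_PORT=3306,
  MASTER_USER='repl', MASTER_PASSWORD='xxxx',
  MASTER_AUTO_POSITION=1 FOR CHANNEL 'dr1';

CHANGE MASTER TO MASTER_HOST='sakthi2', MASTER_PORT=3306,
  MASTER_USER='repl', MASTER_PASSWORD='xxxx',
  MASTER_AUTO_POSITION=1 FOR CHANNEL 'dr2';

CHANGE MASTER TO MASTER_HOST='sakthi3', MASTER_PORT=3306,
  MASTER_USER='repl', MASTER_PASSWORD='xxxx',
  MASTER_AUTO_POSITION=1 FOR CHANNEL 'dr3';

START SLAVE FOR CHANNEL 'dr1';
START SLAVE FOR CHANNEL 'dr2';
START SLAVE FOR CHANNEL 'dr3';
```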
The final output from my asynchronous slave is:
I hope this blog will help someone who is working on MySQL Group Replication. I will come up with my next blog as soon as possible.
As MySQL database administrators, we all use the MySQL client program to communicate with the MySQL server. Mostly, the client program is used to execute SQL, monitor the traffic, and modify variables. The MySQL client program has some good features that can help make our work easier. In this blog, I am going to explain some MySQL client program features that impressed me.
Execute the OS commands inside the MySQL client program
Create / Execute the SQL file inside the MySQL client program
Query output in XML and HTML formats
MySQL client program for learning MySQL commands
Pager with MySQL client program
Safe UPDATE / DELETE with MySQL client program
Define the maximum number of rows in a result set (SELECTs)
Execute the OS commands inside the MySQL client program:
Yes, it is possible to execute OS commands inside the MySQL client program. There is no need to come out of your MySQL terminal to execute OS commands.
syntax: system <command name>
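A quick sketch from inside the client (\! is the shortcut form of system):

```sql
system uptime
\! date
```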
Create / Execute the SQL file inside the MySQL client program:
Yes, it is possible to create and execute a bunch of queries from an SQL file. We can achieve this by using the command edit (\e).
syntax: edit
Save the file using ":wq!".
After saving the file, type a semicolon (;) and you can see the queries being executed.
Query output in XML and HTML formats:
Yes, we have the option to display the result set in different formats like XML and HTML. The output can be used for various purposes, like sending an email with the HTML output, etc.
for XML,
syntax: mysql --xml
for HTML,
syntax: mysql --html
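For example, running a query non-interactively in both formats (the query itself is just an illustration):

```shell
mysql --xml  -e "SELECT 1 AS id, 'jc' AS name"
mysql --html -e "SELECT 1 AS id, 'jc' AS name"
```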
MySQL client program for learning MySQL commands:
The MySQL client program provides the help command with the option "contents" for learning purposes. This is very helpful for learning the MySQL commands (like a manual).
syntax: help contents
Here I have chosen Utility, and then EXPLAIN from the Utility section.
Pager with MySQL client program:
The most common usage of pager is to set it to a Unix pager such as more, less, or cat. The blog below has nice examples of the MySQL pager command.
Using pager to print only the lines matching a particular pattern (e.g. Query),
Using pager to discard the result sets,
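A sketch of both tricks from inside the client:

```sql
-- Show only the lines containing "Query" (e.g. from SHOW PROCESSLIST output)
pager grep Query

-- Discard the result sets entirely; useful when only the timing matters
pager cat > /dev/null

-- Restore the normal output
nopager
```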
Safe UPDATE / DELETE with MySQL client program:
The MySQL client program provides the option "--safe-updates". If this option is enabled, UPDATE and DELETE statements that do not use a key in the WHERE clause or a LIMIT clause produce an error.
Here the column batch_id doesn't have an index. Now, I am going to enable "--safe-updates" and perform the UPDATE/DELETE based on the column batch_id=103 (which doesn't have an index).
It is not working, because the "--safe-updates" flag prevents it.
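A minimal sketch (the table name and data are assumptions):

```sql
-- Inside a client started with: mysql --safe-updates
UPDATE jc_batch SET status = 'done' WHERE batch_id = 103;
-- fails with ERROR 1175 (safe update mode), since batch_id has no index

-- A LIMIT clause (or an indexed predicate) makes the statement acceptable
UPDATE jc_batch SET status = 'done' WHERE batch_id = 103 LIMIT 100;
```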
Define the maximum number of rows in a result set (SELECTs):
We can define an automatic row limit for SELECT statements when using --safe-updates (via the --select-limit option).
From the above example, it returns one row, as per the configuration (--select-limit=1).
I hope every MySQL DBA learned these things when they started their career with MySQL. Nowadays, we have advanced client programs like MySQL Shell. My personal intention with this post is to spread the good things about the MySQL client program, because I personally love it a lot. Hope this post helps someone who has started to learn MySQL.
In my point of view, PITR is a very necessary and important skill for a database administrator. Hope this blog will help someone who is looking for a recovery plan. I will come up with my next blog as soon as possible.
Yes, by default your InnoDB transactions don't have full atomicity. But you can control the atomicity of your InnoDB transactions with the variable innodb_rollback_on_timeout. We had a weird data inconsistency issue within a transaction. In this blog, I am going to explain how it happens and how it can be fixed.
What is Atomicity?
From Wikipedia:
Transactions are often composed of multiple statements. Atomicity guarantees that each transaction is treated as a single “unit”, which either succeeds completely, or fails completely: if any of the statements constituting a transaction fails to complete, the entire transaction fails and the database is left unchanged.
What is innodb_rollback_on_timeout?
From MySQL documentation,
InnoDB rolls back only the last statement on a transaction timeout by default. If --innodb-rollback-on-timeout is specified, a transaction timeout causes InnoDB to abort and roll back the entire transaction.
Here, the transaction timeout happens based on the "innodb_lock_wait_timeout" setting.
All good; now let's see how it works in practice.
Experiment 1 (innodb_rollback_on_timeout = OFF):
I have created two MySQL client sessions.
session 1 :
Started a transaction
Ran the UPDATE to set name='christ' where id=1 (no COMMIT)
Now I am going to lock the second row (id=2) in a different session (session 2).
session 2 :
Started the transaction at session 2
Ran the UPDATE to set name='sri' where id=2 (no COMMIT)
Now, I am going to update the same id=2 from session 1, so that it exceeds innodb_lock_wait_timeout and the statement fails.
session 1 :
Trying to access the row (id=2) through an UPDATE
The UPDATE failed with a lock wait timeout, because session 2 already holds that row.
Triggered the COMMIT
But after committing the transaction,
My first UPDATE was committed even though the second statement failed. You can see the output in the screenshot.
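The two sessions can be summarised as follows (the table definition is an assumption):

```sql
-- Session 1                                  -- Session 2
START TRANSACTION;
UPDATE t SET name='christ' WHERE id=1;
                                              START TRANSACTION;
                                              UPDATE t SET name='sri' WHERE id=2;
-- Blocks, then fails after innodb_lock_wait_timeout seconds:
UPDATE t SET name='christ' WHERE id=2;
-- ERROR 1205 (HY000): Lock wait timeout exceeded
COMMIT;
-- With innodb_rollback_on_timeout = OFF, the row id=1 is still updated:
-- only the timed-out statement was rolled back, not the whole transaction.
```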
Sounds like ATOMICITY is missing?
Let's do the second experiment.
Experiment 2 (innodb_rollback_on_timeout = ON):
I am going to repeat the same procedure with innodb_rollback_on_timeout=ON.
session 1 :
started the transaction and updated the row id=1
session 2 :
locking the row id=2
session 1 :
Creating the lock wait timeout inside the transaction, then committing the transaction.
No changes happened. Makes sense?
With innodb_rollback_on_timeout = ON, neither statement was committed and the complete transaction was rolled back. THE ENTIRE TRANSACTION WAS ROLLED BACK. So, this is exact atomicity, right?
Note: Changing innodb_rollback_on_timeout requires a MySQL restart.
Hope this blog helps you understand the variable innodb_rollback_on_timeout. The experiments show that having innodb_rollback_on_timeout = ON provides complete atomicity on your InnoDB system.
We had an interesting requirement from one of our clients: to have two MySQL partitions (partition_full / partition_half) to store names based on the user input.
The table has two columns, first_name and last_name. When the user provides input for both columns, the data for another column, full_name, needs to be computed automatically, and the status needs to be considered "FULL".
If the column last_name has no input from the user, then the first_name data needs to be used as the full_name, and the status needs to be considered "HALF".
We need separate partitions for both statuses, HALF and FULL.
We can achieve this with the help of virtual / generated columns and LIST partitioning. In this blog, I am going to explain the complete steps I followed to achieve this.
What is a Virtual Column?
Virtual columns are generated columns: the data for these columns is computed from an expression defined in the column definition. Generated columns can be defined as VIRTUAL (computed when the row is read) or STORED (computed and stored when the row is written).
Here is a detailed blog post from the Mydbops team, which contains nice details about virtual columns.
MySQL partition with Virtual / Generated columns
Step 1 – ( Creating the table with virtual columns )
cmd :
create table Virtual_partition_test (
id int(11) not null auto_increment primary key,
first_name varchar(16),
last_name varchar(16) default 0,
full_name varchar(32) as (case last_name when '0' then first_name else concat(first_name,' ',last_name) end) stored,
name_stat varchar(7) as (case full_name when concat(first_name,' ',last_name) then 'full' else 'half' end) stored,
email_id varchar(16)
);
full_name – computes its data from the columns first_name and last_name.
name_stat – computes the name status from the columns first_name and last_name.
Step 2 – ( Testing the virtual/generated column behaviour )
insert into Virtual_partition_test (first_name,last_name,email_id) values ('sri','ram','firstname.lastname@example.org'),('hercules','7sakthi','email@example.com'),('asha','mary','firstname.lastname@example.org');
insert into Virtual_partition_test (first_name,email_id) values ('vijaya','email@example.com'),('durai','firstname.lastname@example.org'),('jc','email@example.com');
Yes, I have created 3 FULL and 3 HALF names.
The above result set illustrates that the virtual/generated columns are working perfectly, as expected.
Step 3 – ( Adding the partition key )
It is important to have the partition column as part of the PRIMARY KEY.
cmd : alter table Virtual_partition_test drop primary key, add primary key (id,name_stat);
Step 4 – ( Configuring the partition )
alter table Virtual_partition_test partition by list columns(name_stat) (partition partition_full values in ('FULL') engine=InnoDB, partition partition_half values in ('HALF') engine=InnoDB);
The partitions have been added as per the requirement.
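The row distribution over the partitions can be verified from information_schema, for example:

```sql
SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_NAME = 'Virtual_partition_test';
```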
Hope this blog helps someone who is looking at partitioning over virtual / generated columns.
Galera is one of the best solutions for high availability, and it is used by many people worldwide. Galera performs synchronous replication (really, it is certification-based replication) to keep the data updated on the group nodes. In this blog, I explain how the Galera replication works. For better understanding, I have made an architecture diagram to describe the replication flow. I have also provided explanations for the key terms used in the architecture diagram.
Architecture flow diagram:
What is a writeset?
The writeset contains all the changes made to the database by the transaction, plus the append_key entries of the changed rows.
What is append_key?
Append_key registers the keys of the data changed by the transaction. The key for a row is represented in three parts: DATABASE NAME, TABLE NAME, PRIMARY KEY.
If the table doesn't have a PRIMARY KEY, a hash of the modified data will be part of the writeset instead.
What is Certification in Galera?
Certification in Galera is performed to detect conflicts and to ensure data consistency among the group. It is performed before the transaction commit.
What is the CCV (Central Certification Vector)?
The CCV is used to detect conflicts. The keys modified by a transaction are added into the Central Certification Vector; if an added key is already part of the vector, conflict resolution checks are triggered.
Hope this blog helps someone who is working with Galera Cluster. I will come up with my next blog soon.