Showing posts with label MySQL. Show all posts
Showing posts with label MySQL. Show all posts

Sunday, 24 November 2013

MySQL Master Slave Synchronization

In MySQL when Master/slave synchronization brakes, we can start the debugging with show slave status command like the following example illustrates:

[root@my-db-slave ~]# echo "show slave status\G" | mysql -u root -ppass
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000522
          Read_Master_Log_Pos: 50224067
               Relay_Log_File: relay.000005
                Relay_Log_Pos: 10328614
        Relay_Master_Log_File: binlog.000520
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
                   Last_Errno: 1032
                   Last_Error: Could not execute Delete_rows event on table mydatabase.example_table; Can't find record in 'example_table', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log binlog.000520, end_log_pos 10330060
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 10328471
              Relay_Log_Space: 259941828
              Until_Condition: None
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
               Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Delete_rows event on table mydatabase.example_table; Can't find record in 'example_table', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log binlog.000520, end_log_pos 10330060
             Master_Server_Id: 49

On Slave MYSQL_HOME/logs/mysql.err this can be shown as:

131108 18:22:55 [Note] Slave I/O thread: connected to master '',replication started in log 'binlog.000522' at position 50224067
131109  2:26:19 [ERROR] Could not execute Delete_rows event on table mydatabase.example_table; Can't find record in 'example_table', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log binlog.000520, end_log_pos 10330060, Error_code: 1032
131109  2:26:19 [Warning] Slave: Can't find record in 'alf_node' Error_code: 1032
131109  2:26:19 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'binlog.000522' at position 50224067

There are several treatments for this issue that depend on what is the state of binlogs on master and how long the slave has been out of synchronization.
The following were tested on  MySQL 5.5.31 /  RedHat  2.6.32-358.2.1.el6.x86_64

The simple case : Restart Slave 

When synchronization fails between MASTER/SLAVE due to some error, normally the slave retries to reconnect after MASTER_CONNECT_RETRY seconds.
Slave will try to reconnect MASTER_RETRY_COUNT times.

In some cases, when synchronization lost within a small time frame and not many updates have been to the Master, then a simple "stop slave" / "start slave" may restore sync:

On Slave:
[root@my-db-slave ~]# echo "stop slave" | mysql -u root -ppass
[root@my-db-slave ~]# echo "start slave" | mysql -u root -ppass
[root@my-db-slave ~]# echo "show slave status\G" | mysql -u root -ppass

Resetting the Slave (Ignore Master Updates )

Another treatment is to reset the Slave to follow the master in the future: doing this you are lossing all updates from master and you just want slave to ignore all changes happened to the master while slave was out of sync!

On Master
[root@my-db-master ~]#echo "SHOW MASTER STATUS" | mysql -uroot -ppass 

On Slave:
[root@my-db-slave ~]#Take from master the binlog and pos
[root@my-db-slave ~]#echo "slave stop"| mysql -uroot -ppass
[root@my-db-slave ~]#echo "CHANGE MASTER TO MASTER_HOST='my-db-master',MASTER_USER='replication', MASTER_PASSWORD='apassword', MASTER_LOG_FILE='binlog.001006', MASTER_LOG_POS=  102018129;"| mysql -uroot -ppass
[root@my-db-slave ~]#echo "start slave"| mysql -uroot -ppass
[root@my-db-slave ~]#echo "SHOW SLAVE STATUS\G" | mysql -uroot -ppass  

Forcing Slave to get binlog files from Master

Sometimes, it might help to force slave get the binlogs again from the master. This will work if master has not rotate its binlogs: in other words if master have not started writting the binlogs again. That may happened because for example someone issued a RESET MASTER  on master. To force SLAVE retrieve the binlogs from start do the following steps:

On Slave:
Stop slave & Database
[root@my-db-slave ~]#echo "slave stop"| mysql -uroot -ppass
[root@my-db-slave ~]#/etc/init.d/mysql stop

Be sure that you have enough space when you do the next step:
[root@my-db-slave ~]#rm -rf $MYSQL-HOME/relaylogs/* 
[root@my-db-slave ~]#rm -rf $MYSQL-HOME/mysqld-relay-bin.*  $MYSQL-HOME/mysqld-relay-bin.index  $MYSQL-HOME/
[root@my-db-slave ~]#/etc/init.d/mysql start
[root@my-db-slave ~]#echo "stop slave" | mysql -u root -ppass
[root@my-db-slave ~]#echo "reset slave" | mysql -u root -ppass
[root@my-db-slave ~]#echo "start slave" | mysql -u root -ppass

Restore Slave from clear export from Master

Use this when you cannot restore the slave with any of the above ways.

On Master:
I need only mydatabase database, do not care about the others. Even if mydatabase db is in usage from application clients:

Take the dump of current Master database
[root@my-db-master ~]# sudo -i
[root@my-db-master ~]# cd /dbexport/
[root@my-db-master ~]# nohup mysqldump --master-data  -u root  -ppass mydatabase > mydatabase.dump &

With the previous command mysql dumps mydatabase database to the mydatabase.dump by:
Reserving the master binlog  status at the moment of backup transaction
Locking each backuped table in mydatabase, so it corresponds to the binlog state.

Note that: with the previous command I export only one database instance : mydatabase
In case your server hosts more that one databases and those are replicated on different slaves, you should consider taking an export for them also.

Check the last command output and also the dump status:
[root@my-db-master ~]# head mydatabase.dump
[root@my-db-master ~]# tail mydatabase.dump

Put a new password for trasport the dump
[root@my-db-master ~]# passwd

On Slave:
[root@my-db-slave ~]# sudo -i
[root@my-db-slave ~]# cd /monsoon/dbexport/
[root@my-db-slave ~]# scp root@my-db-master:/dbexport/mydatabase.dump .

[root@my-db-slave ~]# echo "show slave status\G" |  mysql -u root -ppass | egrep "Slave_IO_Running|Slave_SQL_Running"

             Slave_IO_Running: No
            Slave_SQL_Running: No

[root@my-db-slave ~]# nohup mysql  -u root  -ppass mydatabase < mydatabase.dump &

[root@my-db-slave ~]#  cat nohup.out

--there must be nothing here--

[root@my-db-slave ~]# echo "start slave" |  mysql -u root -ppass

[root@my-db-slave ~]# echo "show slave status\G" |  mysql -u root -ppass | egrep "Slave_IO_Running|Slave_SQL_Running"
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

Monday, 11 November 2013

MySQL Replication implementation details

The previous and current week I am dealing with MySQL replication.

Currently, I have a master-slave setup, with slave operating in readonly mode.

According to MySQL manual page, replication in slave is implemented using two threads:
  • Thread I/O is polling the Master for binlog updates: this thread polls master for any updates written to its binlog. If master has updates, I/O thread puts them in a queue.
  • Thread SQL reads the updates queue and executes them in the slave.
To assist my study, I created the following pictures illustrating the MySQL replication threads operations. Here we have to state that those operations are for the simple NON-Clustered replication as described above.  
This picture describes Slave I/O replication thread:

And the following one describes the operations of the Slave SQL replication thread:

The basic parameters for replication can be viewed from the MySQL manual here.

Thursday, 7 November 2013

MySQL: "ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)"

Sometimes even the easy things like master-slave resync may break your nerves....

I faced that error while I was trying to resynchronize my master-slave  servers by doing an export from master/import to slave procedure.

This error was due to the fact that slave was started after an automated installation of MySQL with Opscode Chef. Without taking a look to the Chef recipe, the slave server was re-created with nothing:
no users, no databases

 To cut the long story, when I tried login to it I faced this issue:
MySQL: "ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)"

Meaning, even if /etc/my.cnf was set without password:
# By default ask for a password

when I tried to connect I got:
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)


1. Stop MySQL service
/etc/init.d/mysqld stop

2. Start the service without the grant tables:
mysqld_safe --skip-grant-tables &

3. On the MySQL command prompt enter the following commands:

use mysql;
select * from user;

Now here is the interesting point:
if you see root user from the result set of previous select, you just have to update the root password:
update user set `Password`=PASSWORD('some_pass');

If you don't see root user  then issue the following insert command:

More on creating mysql root user you  can see on MySQL user accounts manual page.

Then flush the MySQL tables and leave:
flush tables;

4. Next step is to stop the service that runs without grants tables:
/etc/init.d/mysqld stop
Shutting down MySQL....131107 13:11:38 mysqld_safe mysqld from pid file /var/run/mysqld/ ended
                                             [  OK  ]
[1]+  Done                    mysqld_safe --skip-grant-tables

5. Restart MySQL normally:
/etc/init.d/mysqld start

6. Try to log with the password you previously set:
mysql -uroot -psome_pass
Welcome to the MySQL monitor. ....

Hope it helps!

Tuesday, 5 November 2013

A simple MySQL monitor for detecting InnoDB X-Locks

To monitor blocking transactions and lock waits on MySQL I use the following simple script:

# MySQL Monitoring script
# Use me when you need to log MySQL txns and detect x-locs

while true
 date +'%Y-%m-%d %H:%M:%S,%3N' >> locks.txt
 date +'%Y-%m-%d %H:%M:%S,%3N' >> lock_waits.txt
 date +'%Y-%m-%d %H:%M:%S,%3N' >> tables.txt
 echo "select * from information_schema.INNODB_LOCK_WAITS " | mysql -u root -p$DBPASS >> lock_waits.txt
 echo "select lock_id,Lock_trx_id,lock_mode,lock_table
        from information_schema.INNODB_LOCKS where lock_mode='X'" | mysql -u root -p$DBPASS >>  tables.txt
 echo "select r.trx_id waiting_trx_id,
       r.trx_mysql_thread_id waiting_thread,
       r.trx_query waiting_query,
       b.trx_id blocking_trx_id,
       b.trx_mysql_thread_id blocking_thread,
       b.trx_query blocking_query
   from information_schema.innodb_lock_waits w
   inner join information_schema.innodb_trx b  on
    b.trx_id = w.blocking_trx_id
   inner join information_schema.innodb_trx r  on
    r.trx_id = w.requesting_trx_id;"  | mysql -u root -p$DBPASS >>  locks.txt
 sleep 1 ;

This script creates three files and updates them with a timestamp every second.  Those files are:

locks.txt : shows which locs occurred by showing the transaction ID, thread ID from the blocked transaction and  transaction ID, thread ID from the blocking transaction

tables.txt Shows which transactions have placed an exclusive lock

lock_waits.txt: Shows transactions waiting for a lock over a table

 Use it as follows:

root@my-db-master:~[1402]#./ &
[1] 23184
root@my-db-master:~[1396]#ls -l
total 60
-rw-r--r--  1 root root   216 Nov  5 20:39 locks.txt
-rw-r--r--  1 root root   216 Nov  5 20:39 lock_waits.txt
-rwxr--r--  1 root root  1230 Nov  5 20:38
-rw-r--r--  1 root root   216 Nov  5 20:39 tables.txt

Thursday, 31 October 2013

Digging into MySQL InnoDB Transactions

A week ago I faced a really difficult situation: one of our MySQL innodb databases was slowing down in response times really a lot. With a bit of debugging, I figure out that there were some X-locks (eXclusive locks) that did not allowed some of the later transactions to finish. Here is the procedure I followed:

1. Use MySQL processlist to find out slow operations. The first one show us the slow queries on a specific database:

select * from information_schema.processlist 
              where db='alfresco' and command='Query' order by 'time' desc; 

Since I understood that the specific slow queries are utilizing two tables I also used:

select * from information_schema.processlist 
      where db='alfresco' and info like '%alf_%_assoc%' 

2. Now since we have one idea about heavy MySQL threads, lets take a look to the transactions:
First of all see all transactions that have placed an exclusive lock on one of the tables:
Second find the lock wait transactions

select * from information_schema.INNODB_TRX 
         where trx_id in ( 
             select Lock_trx_id from information_schema.INNODB_LOCKS where lock_mode='X'  )

select * from information_schema.INNODB_TRX where trx_id in (
         select distinct blocking_trx_id 
            from information_schema.INNODB_LOCK_WAITS ) 

3. Finally find the threads along with conflicting transactions with the following query:

SELECT r.trx_id waiting_trx_id,  
       r.trx_mysql_thread_id waiting_thread,
       r.trx_query waiting_query,
       b.trx_id blocking_trx_id, 
       b.trx_mysql_thread_id blocking_thread,
       b.trx_query blocking_query
   FROM       information_schema.innodb_lock_waits w
   INNER JOIN information_schema.innodb_trx b  ON  
    b.trx_id = w.blocking_trx_id
  INNER JOIN information_schema.innodb_trx r  ON  
    r.trx_id = w.requesting_trx_id;
Beware: The thread ID (waiting and blocking) is the ID in the information_schema.processlist

In this case, clearly the transaction with id 194D110F is the blocker!
To inspect the transaction use this query:

select * from information_schema.INNODB_TRX where trx_id='194D110F'

From what we see, there is a transaction running since 19:10:57 with id 194D110F. The query or operation carried out of this transaction is null because it was deleted from the transaction cache since it is very old. To retrieve the information about the SQL statement carried from this transaction we have to use the MySQL binlog.

The transaction 194D110F was observed during a  load test performed on our system. The lock was not released for 2 h.
The locking thread in MySQL was seen as follows from the slow.log:

---TRANSACTION 194D110F, ACTIVE 7431 sec
74 lock struct(s), heap size 14776, 101 row lock(s), undo log entries 64
MySQL thread id 7526, OS thread handle 0x7fd28ec38700, query id 10005877 application-host X.X.44.28 alfresco
Trx read view will not see trx with id >= 194D1110, sees < 194C03CD