BACKTRACE

(condense)

Back to Posts List

GET

MySQL replication and Rails

Replication enables data from one MySQL database server (called the master) to be replicated to one or more MySQL database servers (slaves). Replication is asynchronous – your replication slaves do not need to be connected permanently to receive updates from the master, which means that updates can occur over long-distance connections and even temporary solutions such as a dial-up service. Depending on the configuration, you can replicate all databases, selected databases, or even selected tables within a database.

As your application grows, you should consider spreading the load among multiple slaves to improve performance. In this environment, all writes and updates must take place on the master server. Reads, however, may take place on one or more slaves. This model can improve the performance of writes (since the master is dedicated to updates), while dramatically increasing read speed across an increasing number of slaves.

Setting up MySQL replication

Create Replication User on the master

Login to your the your master database console and create a user for the slave to access the master’s changes.

CREATE USER ‘replicator’ IDENTIFIED BY ‘password’

then, apply the permissions required to access the master from the slave:

GRANT REPLICATION SLAVE ON . ’replicator’@’%.mydomain.com’ IDENTIFIED BY ‘password’

The % basically means that the user replicator will be able to access the master database from every subdomain on your domain, as long as it’s set on your /etc/hosts.

Setting the Master Configuration

First, we must enable binary-logging and give the master a unique id (integer). Open /etc/mysql/my.cnf and add these following lines:

[mysqld]
log-bin=mysql-bin
server-id=1

Note that the server-id value can be any integer value. Two important things to pay attention to:

  • For the greatest possible durability and consistency in a replication setup using InnoDB with transactions, you should use innodb_flush_log_at_trx_commit=1 and sync_binlog=1 in the master my.cnf file.
  • Ensure that the skip-networking option has not been enabled on your replication master. If networking has been disabled, then your slave will not able to communicate with the master and replication will fail.

You’ll have to restart your slave server in order this changes will take affect.

Setting the Slave Configuration

The only thing you need to do on the slave is to add a server-id just like we added on the master’s my.cnf but a different, unique id:

[mysqld]
server-id=2

You’ll have to restart your slave server in order this changes will take affect.

Determine the Master replication information

This is the bitchy part, you must determine the master’s current point within the master binary log. You will need this information so that when the slave starts the replication process, it is able to start processing events from the binary log at the correct point.

Why? you ask?

If you have existing data on your master that you want to synchronize on your slaves before starting the replication process, then you must stop processing statements on the master, obtain the current position, and then dump the data, before allowing the master to continue executing statements. If you do not stop the execution of statements, the data dump and the master status information that you use will not match and you will end up with inconsistent or corrupted databases on the slaves.

AND YOU DON’T WANT THAT TO HAPPEN, TRUST ME.

So, first thing we have to do, is to stop processing statements on the master. Log in to your database command line and invoke:

FLUSH TABLES WITH READ LOCK;

Rails related tip: any database action will be held at the beginning of the transaction.. you might see your log halting on something like:

Processing DashboardController#show (for 127.0.0.1 at 2009-07-26 22:43:15) [GET]
Parameters: {"action"=>"show", “controller”=>"dashboard"}
User Load (0.4ms) SELECT * FROM `users` WHERE (`users`.`id` = ‘23’) LIMIT 1
SQL (0.1ms) BEGIN

This action and all other pending db actions will not proceed until you will close the command line where you invoked the FLUSH TABLES you did before, so keep it open until you get all the data you need from the master.

Now enter in the command line SHOW MASTER STATUS;:

mysql > SHOW MASTER STATUS;
-—————————————————————————————-
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
-—————————————————————————————-
| mysql-bin.001 | 14 | test | manual,mysql |
-—————————————————————————————-

The File column shows the name of the log file and Position shows the offset within the file. In this example, the binary log file is mysql-bin.001 and the offset is 14. Record these values. You need them later when you are setting up the slave. They represent the replication coordinates at which the slave should begin processing new updates from the master.

If the master has been running previously without binary logging enabled, the log name and position values displayed by SHOW MASTER STATUS or mysqldump --master-data will be empty. In that case, the values that you need to use later when specifying the slave’s log file and position are the empty string (’’) and 4.

At this point, if you are setting up a new pairing and there’s no data in the Master, you can exit the client and release the locks on the tables.
If you do have data on the master, you need to sync that data first, so don’t quiet the client just yet.

Initial Sync between master and slave

If you accidently (grrr) exited the console where you invoked the FLUSH TABLES statement, go back and do the previous step again, you deserve it.

If you didn’t (good boy/girl), open another command line (OS, not mysql) on your master and invoke:

shell> mysqldump —user=user —password=password —all-databases —master-data | gzip > masterdump.sql.gz

This will create a new little archive for you to transfer to the slave faster.
When you dump with --master-data the dump will also include a statement that directs the slave to the master (CHANGE MASTER TO).
Now you can go back to your mysql console and either quit it or invoke:

UNLOCK TABLES;

If your database is extremely large, dumping the SQL is not the fastest / smartest thing to do. Creating a snapshot using raw data files would probably make more sense.

Import Master data into slave

First, start the slave (if it’s already up, stop it) with the --skip-slave option:

shell>> /etc/init.d/mysql start —skip-slave

Now import the dump you created before (assuming that you transfered before):

shell>> mysql < masterdump.sql

Set the Master configuration on the slave

Fire up your slave database console and invoke the CHANGE MASTER statement:

mysql> CHANGE MASTER TO
→ MASTER_HOST=‘master_host_name’,
→ MASTER_USER=‘replication_user_name’,
→ MASTER_PASSWORD=‘replication_password’,
→ MASTER_LOG_FILE=‘recorded_log_file_name’,
→ MASTER_LOG_POS=recorded_log_position;

the MASTER_LOG_FILE and MASTER_LOG_POS values should be values you got from invoking MASTER STATUS on the beginning of this process.

Start the slavery again!

Invoke

mysql> START SLAVE;

Setting Up Rails

Once we setup the replication on the database layer, we have to tell rails to seperate INSERT,UPDATEand DELETE from the SELECT statementes, and forward them to the master databse that will be responsible for writing data in.
SELECT statement would be directed to the slave(s).

Masochism

Masochism (or here) is a simple and clean replication manager for Rails. Masochism enables you to set one slave and a master.

setup is pretty simple:

# default configuration (slave)
production: &defaults
adapter: mysql
database: app_production
username: webapp
password: ********
host: localhost # or where ever the slave is.

  1. setup for masochism (master)
    master_database:
    <<: *defaults
    host: master.example.com

and to launch it on production, just create an initializer in your config/initializers:

config.after_initialize do
if Rails.env.production?
ActiveReload::ConnectionProxy::setup!
end
end

and if you run passenger, you should add this to your application_controller.rb:

# in ApplicationController
prepend_before_filter do |controller|
unless ActiveRecord::Base.connection.is_a? ActiveReload::ConnectionProxy
ActiveReload::ConnectionProxy.setup!
end
end

Take a look at the README file if you are using ThinkingSphinx or need some extra setup options.

The major fault back to this plugin is that does not support more than one slave.. on the other hand, you can always install a cluster and have only 2 hosts (master and slave) with endless database behind.

data_fabric

data_fabric by FiveRuns provides flexible database connection switching for ActiveRecord, which does not only solve the replication on the application level, but also enables sharding.
Sharding is the process of splitting a dataset across many independent databases. This often happens based on geographical region (e.g. craigslist) or category (e.g. ebay).

Read the readme file. and check out the example database.yml

master_slave_adapter

master_simple_adapter (by @mauriciojr attempts to give out the same benefits as masochism does, only with supea-fly magic and voodoo.
I haven’t had the chance to use it, but it seems to be clean and simple, judging by the code.

Final thoughts

I suggest that you’d consult a DBA before building a database infrastructure on your own, there’s a lot to be done that was not covered by this tutorial and is known to those who do it on a matter of their job.

Useful Links

That’s it! hope it helps.

POST

blog comments powered by Disqus

I Don't have cookies.

ELAD ENV

Variable Value
LINKEDIN
TWITTER
FACEBOOK
GITHUB
WWR
IRC
{ 'irc.freenode.net' => [ '#rubyonrails', '#railsbridge', '#ruby', '#mootools' ]}
SKYPE
eladmeidar

You're seeing this error because I think it is funny.