MySQL High Availability- P6

Document information

data from different sources for reporting purposes. In these cases, you can separate data naturally by storing the writes from each master in its own database, table, or partition. There is no risk of conflict, so it should be possible to use multisource replication.

Figure 6-6. True multisource and a diamond configuration

Figure 6-7 shows a slave that replicates from three masters in a round-robin fashion, running a client dedicated to handling the switches between the masters. The process for round-robin multisource replication is:

1. Set the slave up to replicate from one master. We’ll call this the current master.
2. Let the slave replicate for a fixed period of time. The slave will then read changes from the current master and apply them while the client responsible for handling the switching just sleeps.
3. Stop the I/O thread of the slave using STOP SLAVE IO_THREAD.
4. Wait until the relay log is empty.
5. Stop the SQL thread using STOP SLAVE SQL_THREAD. CHANGE MASTER requires that you stop both threads.
6. Save the slave position for the current master by saving the values of the Exec_Master_Log_Pos and Relay_Master_Log_File columns from the SHOW SLAVE STATUS output.
7. Change the slave to replicate from the next master in sequence by taking the previously saved positions and using CHANGE MASTER to set up replication.
8. Restart the slave threads using START SLAVE.
9. Repeat the sequence starting from step 2.

Note that in steps 3 through 5, we stop first the I/O thread and then the SQL thread. The reason for doing this and not just stopping replication on the slave is that the SQL thread can be lagging behind (and usually is), so if we just stop both threads, there will be a bunch of outstanding events in the relay log that will just be thrown away.
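Steps 4, 6, and 7 can be sketched as three small helpers. This is only an illustration under stated assumptions: the status row is taken to be a plain dictionary keyed by SHOW SLAVE STATUS column names, the helper names are hypothetical and not part of the Replicant library, and the statement builder does no escaping of the host name.

```python
def relay_log_is_empty(status):
    """Step 4: true when the SQL thread has applied everything the I/O thread fetched."""
    return (status["Relay_Master_Log_File"] == status["Master_Log_File"]
            and int(status["Exec_Master_Log_Pos"]) == int(status["Read_Master_Log_Pos"]))


def saved_position(status):
    """Step 6: the position to remember for the master the slave is leaving."""
    return (status["Relay_Master_Log_File"], int(status["Exec_Master_Log_Pos"]))


def change_master_statement(host, port, position):
    """Step 7: build the CHANGE MASTER statement for the next master (hypothetical helper)."""
    log_file, log_pos = position
    return ("CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=%d, "
            "MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d" % (host, port, log_file, log_pos))
```

Comparing the read position with the executed position is one common way to decide that the relay log has been drained; it only makes sense after the I/O thread has been stopped, which is why step 3 comes first.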
If you are more concerned about executing only, say, one minute’s worth of transactions from each master and don’t care about throwing away those additional events, you can simply stop replication instead of performing steps 3 through 5. The procedure will still work correctly, since the events that were thrown away will be refetched from the master in the next round.

This can, of course, be automated using a separate client connection and the MySQL Replicant library as shown in Example 6-17. By using the cycle function from the itertools module, you can repeatedly read from a list of masters in turn.

Figure 6-7. Round-robin multisource replication using a client to switch

Example 6-17. Round-robin multisource replication in Python

    import itertools
    from time import sleep

    import mysqlrep   # the MySQL Replicant library

    position = {}     # last saved slave position, keyed by master name

    def round_robin_multi_master(slave, masters):
        current = masters[0]
        for master in itertools.cycle(masters):
            # Steps 3-5: stop fetching, let the SQL thread drain the relay log, then stop it
            slave.sql("STOP SLAVE IO_THREAD")
            mysqlrep.wait_for_empty_relay_log(slave)
            slave.sql("STOP SLAVE SQL_THREAD")
            # Step 6: remember how far the slave got on the master it is leaving
            position[current.name] = mysqlrep.fetch_slave_position(slave)
            # Step 7: point the slave at the next master, using that master's saved position.
            # Assumption: change_master accepts None for a master that has not been visited yet.
            slave.change_master(position.get(master.name))
            # Step 8: the threads are restarted on the slave, not on the master
            slave.sql("START SLAVE")
            current = master
            sleep(60)       # Step 2: let the slave replicate for one minute

Row-Based Replication

The primary goal of replication is to keep the master and the slave synchronized so they have the same data. As you saw earlier, replication offers a number of special features to ensure the results are as close as possible to being identical on master and slave: context events, session-specific IDs, etc. Despite this, there are still some situations that statement-based replication can’t currently handle correctly:

• As you saw earlier in this chapter, if an UPDATE, DELETE, or INSERT statement contains a LIMIT clause, it may cause problems if a database crashes during execution.
• If there is an error during execution of a nontransactional statement, there is no guarantee that the effects are the same on the master and the slave.
• If a statement contains a call to a UDF, there is no way to ensure the same value is used on the slave.
• If the statement contains any nondeterministic function, such as USER, CURRENT_USER, or CONNECTION_ID, results may differ between master and slave.
• If a statement updates two tables with autoincrement columns, it will not work correctly, because only a single last insert ID can be replicated, which will then be used for both tables on the slave, while on the master, the insert ID for each table will be used individually.

In these cases, it is better to replicate the actual data being inserted into the tables, which is what row-based replication does. Instead of replicating the statement that performs the changes, row-based replication replicates each row being inserted, deleted, or updated separately, with the values that were used for the operation. Since the row that is sent to the slave is the same row that is sent to the storage engine, it contains the actual data being inserted into the table. Hence there are no UDFs to consider, no autoincrement counters to keep track of, and no partial execution of statements to take into consideration: just data, plain and simple.

Row-based replication opens up an entirely new set of scenarios that you just cannot accomplish with statement-based replication. However, you must also be aware of some differences in behavior. When choosing between statement-based and row-based replication, consider the following:

• Do you have statements that update a lot of rows, or do the statements usually only change or insert a few rows?
  If the statement changes a lot of rows, statement-based replication will have more compact statements and may execute faster.
  But since the statement is executed on the slave as well, this is not always true. If the statement has a complex optimization and execution plan, it might be faster to use row-based replication, because the logic for finding rows is much faster.
  If the statement changes or inserts only a few rows, row-based replication is potentially faster because there is no parsing involved and all processing goes directly to the storage engine.
• Do you need to see which statements are executed? The events for handling row-based replication are hard to decode, to say the least. In statement-based replication, the statements are written into the binary log and hence can be read directly.
• Statement-based replication has a simple replication model: just execute the same statement on the slave. This has existed for quite some time and is familiar to many DBAs. Row-based replication, on the other hand, is comparatively new and can potentially be harder to fix when replication fails.
• If data is different on master and slave, executing statements can yield different results on master and slave. Sometimes this is intentional, in which case statement-based replication can and should be used, but sometimes it is not intentional and can be prevented through row-based replication.

Row-based and statement-based replication offer different sets of tricks. Some ways of using statement-based replication to your advantage have been demonstrated in the earlier chapters, and you will see some ways to use row-based replication to your advantage in this chapter.

Options for Row-Based Replication

Use the following options to configure row-based replication:

binlog-format
    The binlog-format option can be set to use one of the following modes:

    STATEMENT
        This will use the traditional statement-based replication for all statements.
    ROW
        This will use the shiny new row-based replication for all statements that insert or change data (data manipulation language, or DML, statements).
        However, statement-based replication must still be used for statements that create tables or otherwise alter the schema (data definition language, or DDL, statements).
    MIXED
        This is intended to be a safe version of statement-based replication and is the recommended mode to use with MySQL version 5.1. In mixed-mode replication, the server will write the statements to the binary log as statements, but switch to row-based replication if the statement is considered unsafe through one of the criteria we have discussed in this chapter.

    The variable also exists as a global server variable and as a session variable. When starting a new session, the global value is copied to the session variable and then the session variable is used to decide how to write statements to the binary log.

binlog-row-event-max-size
    Use this option to specify when to start a new event for holding the rows. Since the events are read fully into memory when being processed, this option is a rough way of controlling the size of row-holding events so that not too much memory is used when processing the rows.

Mixed-Mode Replication

Mixed-mode replication is recommended for MySQL version 5.1, but the default value for the binlog-format option is STATEMENT. This might seem odd, but that decision was made to avoid problems for users who upgrade from versions 5.0 or earlier. Because those versions had no row-based replication and users have had to use statement-based replication, the MySQL developers did not want servers to make a sudden switch. If the servers suddenly started sending out row-based replication events when they were upgraded, the deployment would likely be a mess. To reduce the number of factors that an upgrading DBA has to consider, the default for this option remains STATEMENT.
However, if you use one of the template files distributed with MySQL version 5.1, you will notice the binlog-format option has the value MIXED, per the recommendation.

The principles behind mixed-mode replication are simple: use statement-based replication normally and switch to row-based replication for unsafe statements. We have already examined the kinds of statements that can lead to problems and why. To summarize, mixed-mode currently switches to row-based replication if:

• The statement calls any of the following:
  - The UUID function
  - A user-defined function
  - The CURRENT_USER or USER function
  - The LOAD_FILE function
• Two or more tables with an AUTO_INCREMENT column are updated in the same statement.
• A server variable is used in the statement.
• The storage engine does not allow statement-based replication, for example, the MySQL Cluster engine.

This list is, by necessity, incomplete: it is extended as new constructions are discovered to be unsafe. For a complete and accurate list, refer to the online MySQL Reference Manual.

Events for Handling Row-Based Replication

In statement-based replication, statements are handled by writing the statement in a single Query event. However, since a significant number of rows can be changed in each statement, row-based replication handles this differently and therefore requires multiple events for each statement. To handle row-based replication, four new events have been introduced:

Table_map event
    The Table_map event maps a table ID to a table name (including the database name) and some basic information about the columns of the table on the master. The table information does not include the names of the columns, just the types. This is because row-based replication is positional: each column on the master goes into the same position in the table on the slave.
Write_rows, Delete_rows, and Update_rows events
    These events are generated whenever rows are inserted, deleted, or updated, respectively. This means that a single statement can generate multiple events. In addition to the rows, each event contains a table ID that refers to a table ID introduced by a preceding Table_map event and one or two column bitmaps specifying the columns of the table affected by the event. This allows the log to save space by including only those columns that have changed or that are necessary to locate the correct row to insert, delete, or update. Currently, only the MySQL Cluster engine uses the option of limiting the columns sent in the log.

Whenever a statement is executed, it is written into the binary log as a sequence of Table_map events, followed by a sequence of row events. The last row event of the statement is marked with a special flag indicating it is the last event of the statement. Example 6-18 shows the execution of a statement and the resulting events. For the example, we have skipped the format description event, since you have seen that before.

Example 6-18. Execution of an INSERT statement and the resulting events

    master> BEGIN;
    Query OK, 0 rows affected (0.00 sec)

    master> INSERT INTO t1 VALUES (1),(2),(3),(4);
    Query OK, 4 rows affected (0.01 sec)
    Records: 4  Duplicates: 0  Warnings: 0

    master> INSERT INTO t1 VALUES (5),(6),(7),(8);
    Query OK, 4 rows affected (0.01 sec)
    Records: 4  Duplicates: 0  Warnings: 0

    master> COMMIT;
    Query OK, 0 rows affected (0.00 sec)

    master> SHOW BINLOG EVENTS IN 'master-bin.000054' FROM 106\G
    *************************** 1. row ***************************
       Log_name: master-bin.000054
            Pos: 106
     Event_type: Query
      Server_id: 1
    End_log_pos: 174
           Info: BEGIN
    *************************** 2. row ***************************
       Log_name: master-bin.000054
            Pos: 174
     Event_type: Table_map
      Server_id: 1
    End_log_pos: 215
           Info: table_id: 18 (test.t1)
    *************************** 3. row ***************************
       Log_name: master-bin.000054
            Pos: 215
     Event_type: Write_rows
      Server_id: 1
    End_log_pos: 264
           Info: table_id: 18 flags: STMT_END_F
    *************************** 4. row ***************************
       Log_name: master-bin.000054
            Pos: 264
     Event_type: Table_map
      Server_id: 1
    End_log_pos: 305
           Info: table_id: 18 (test.t1)
    *************************** 5. row ***************************
       Log_name: master-bin.000054
            Pos: 305
     Event_type: Write_rows
      Server_id: 1
    End_log_pos: 354
           Info: table_id: 18 flags: STMT_END_F
    *************************** 6. row ***************************
       Log_name: master-bin.000054
            Pos: 354
     Event_type: Xid
      Server_id: 1
    End_log_pos: 381
           Info: COMMIT /* xid=23 */
    6 rows in set (0.00 sec)

This example adds two statements to the binary log. Each statement starts with a Table_map event followed by a single Write_rows event holding the four rows of each statement. You can see that each statement is terminated by setting the statement-end flag (STMT_END_F) of the row event. Since the statements are inside a transaction, they are also wrapped between a Query event containing BEGIN and, in this listing, an Xid event recording the COMMIT.

The size of the row events is controlled by the option binlog-row-event-max-size, which gives a threshold for the number of bytes in the binary log. The option does not give a maximum size for a row event: it is possible to have a binlog row event that has a larger size if a row contains more bytes than binlog-row-event-max-size.

Table map events

As already mentioned, the Table_map event maps a table name to an identifier so that it can be used in the row events, but that is not its only role.
In addition, it contains some basic information about the fields of the table on the master. This allows the slave to check the basic structure of the table on the slave and compare it to the structure on the master to make sure they match well enough for replication to proceed.

The basic structure of the table map event is shown in Figure 6-8. The common header (the header that all replication events have) contains the basic information about the event. After the common header, the post header gives information that is special for the table map event. Most of the fields in Figure 6-8 are self-explanatory, but the representation of the field types deserves a closer look.

Figure 6-8. Table map event structure

The following fields together represent the column type:

Column type array
    An array listing the base types for all the columns. It indicates whether this is an integer, a string type, a decimal type, or any of the other available types, but it does not give the parameters for the column type. For example, if the type of a column is CHAR(5), this array will contain 254 (the constant representing a string), but the length of the string (in this case, 5) is stored in the column metadata mentioned below.

Null bit array
    An array of bits that indicate whether each field can be NULL.

Column metadata
    An array of metadata for the fields, fleshing out details left out of the column type array. The piece of metadata available to each field depends on the type of the field. For example, the DECIMAL field stores the precision and decimals in the metadata, while the VARCHAR type stores the maximum length of the field.

By combining the data in these three arrays, it is possible to deduce the type of the field.
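Combining the three arrays can be sketched as follows. This is a simplified illustration, not the server's decoder: the metadata is assumed to be already unpacked into Python values (a length for string types, a precision and decimals pair for DECIMAL), the null bits are assumed to be stored one per column with the least significant bit first, and only a handful of base type codes are mapped. The function names are hypothetical.

```python
# A few base type codes as they appear in the column type array;
# 254 is the string constant mentioned in the text.
BASE_TYPES = {3: "INT", 15: "VARCHAR", 246: "DECIMAL", 254: "CHAR"}


def is_nullable(null_bits, index):
    """Assumed layout: one bit per column, least significant bit first."""
    return bool(null_bits[index // 8] & (1 << (index % 8)))


def describe_columns(type_codes, metadata, null_bits):
    """Merge base type, metadata, and nullability into one description per column."""
    columns = []
    for index, code in enumerate(type_codes):
        name = BASE_TYPES.get(code, "TYPE_%d" % code)
        meta = metadata[index]
        if code == 246:                      # DECIMAL: (precision, decimals)
            text = "%s(%d,%d)" % (name, meta[0], meta[1])
        elif code in (15, 254):              # string types: length
            text = "%s(%d)" % (name, meta)
        else:                                # no parameters kept for this type
            text = name
        if not is_nullable(null_bits, index):
            text += " NOT NULL"
        columns.append(text)
    return columns


print(describe_columns([3, 254, 246, 15], [None, 5, (10, 2), 40], bytes([0b00001010])))
```

The CHAR(5) case mirrors the example in the text: the type array contributes 254 and the metadata contributes the length 5.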
Not all type information is stored in the arrays, so in two particular cases, it is not possible for the master and the slave to distinguish between two types:

• There is no information about whether an integer field is signed or unsigned. This means the slave will be unable to distinguish between a signed and unsigned field when checking the tables.
• The character sets of string types are not part of the information. This means that replicating between different character sets is not supported and may lead to strange results, since the bytes will just be inserted into the column with no checking or conversion.

The structure of row events

Figure 6-9 shows the structure of a row event. This structure can vary a little depending on the type of event (write, delete, or update).

Figure 6-9. Row event header

In addition to the table ID, which refers to the table ID of a previous table map event, the event contains the following fields:

Table width
    The width of the table on the master. This width is length-encoded in the same way as for the client protocol, which is why it can be either one or two bytes. Most of the time, it will be one byte.

Columns bitmap
    The columns that are sent as part of the payload of the event. This information allows the master to send a selected set of fields with each row. There are two types of column bitmaps: one for the before image and one for the after image. The before image is needed for deletions and updates, whereas the after image is needed for writes (inserts) and updates. See Table 6-1 for more information.

Table 6-1. Row events and their images

Before image                  After image                  Event
None                          Row to insert                Write rows
Row to delete                 None                         Delete rows
Column values before update   Column values after update   Update rows

Event Execution

Because multiple events can represent a single statement executed by the master, the slave has to keep state information to execute the row events correctly in the presence of concurrent threads that update the same tables. Recall that each statement in the binary log starts with one or more table map events followed by one or more row events, each of the same type. Use the following procedure to process a statement from the binary log:

1. Each event is read from the relay log.
2. If the event is a table map event, the SQL thread extracts the information about the table and saves a representation of how the master defines the table.
3. When the first row event is seen, all tables in the list (the tables collected from the table map events in step 2) are locked.
4. For each table in the list, the thread checks that the definition on the master is compatible with the definition on the slave.
5. If the tables are not compatible, the thread reports an error and stops replication on the slave.
6. Row events are processed according to the procedure shown later in this section, until the thread reads the last event of the statement, that is, an event with the statement end flag set.

This procedure is required to lock tables the correct way on the slave and is similar to how the statement was executed on the master. All tables are locked in step 3 and then [...]...
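The six-step procedure can be sketched as a small loop over the events of one statement. Everything here is a hypothetical stand-in chosen for illustration: events are plain dictionaries, locking is represented by recording table names, and compatibility is reduced to comparing lists of column type codes, which is stricter than what a real slave requires.

```python
class IncompatibleTable(Exception):
    """Raised when the master and slave definitions do not match (step 5)."""


def apply_statement(events, slave_tables, apply_rows):
    """Process events up to and including the row event with the statement-end flag.

    events       -- iterable of dicts with keys "type", "table_id", and either
                    "name" and "column_types" (table map) or "rows" (row event)
    slave_tables -- dict mapping table name to the slave's list of column types
    apply_rows   -- callback(table_name, event_type, rows) that applies the rows
    """
    table_maps = {}   # table ID -> table map event, collected in step 2
    locked = []       # names of the tables locked for this statement
    for event in events:                                   # step 1
        if event["type"] == "Table_map":                   # step 2
            table_maps[event["table_id"]] = event
            continue
        if not locked:                                     # first row event seen
            for table_map in table_maps.values():
                locked.append(table_map["name"])           # step 3 (stand-in for a lock)
                if slave_tables.get(table_map["name"]) != table_map["column_types"]:
                    raise IncompatibleTable(table_map["name"])   # steps 4 and 5
        name = table_maps[event["table_id"]]["name"]
        apply_rows(name, event["type"], event["rows"])     # step 6
        if event.get("stmt_end", False):
            return locked
    raise ValueError("no row event carried the statement-end flag")
```

Collecting every table map before taking any lock is what lets the slave lock all tables of the statement at once, in the same way the master did when it executed the statement.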
as MySQL, generally appear on the system as multiple processes. When a CPU is under a performance load and contention is high, the system can exhibit very slow performance and even periods of seeming inactivity. When this occurs, you must either reduce the number of processes or reduce the CPU usage of processes that seem to be consuming more CPU time. But be sure to monitor the CPUs to make sure that high...

... memory to disk and back again is relatively high (it is time-consuming compared to accessing main memory directly), modern operating systems can do it so quickly that the penalty isn’t normally an issue unless it reaches such a high level that the processor and disk cannot keep up with the demands. However, the operating system may perform some swapping at a high level periodically to reclaim memory. Be...

... http://library.gnome.org/users/gnome-system-monitor/

Nagios
    http://www.nagios.org/
Sun Management Center
    http://www.sun.com/software/products/sunmanagementcenter/index.xml
MySQL Enterprise Monitor
    http://www.mysql.com/products/enterprise/monitor.html

We will discuss the MySQL Enterprise Monitor and automated monitoring and reporting in greater detail in Chapter 13. The following sections describe the built-in monitoring tools...

... command is to determine the process IDs for a given program. For example, if you want to know the process IDs of all of the mysqld programs, issue the command:

    ps -A | grep mysqld

This will send the list of all processes to the grep command, which will in turn only show those rows with “mysqld” in them. You can use this technique to find a process ID so you can get detailed information about that process...
performance of the paging system. This information can be helpful if you are seeing a high number of page faults (major page faults are more costly), which could indicate too many processes running. Large numbers of major page faults can cause disk usage problems. That is, if this value is very high and disk usage is high, it may not be that the poor performance of the disk is just a symptom of something...

... yesterday,” he reasoned. After a few sips of his latte he remembered something he read about performance monitoring while working on the lab machines at college. Joel finished his pastry and reached for his MySQL High Availability book. “There has got to be something in here,” he said.

How do you know when your servers are performing poorly? If you wait for your users to tell you something is wrong, chances are...

... the CPU time for nice processes gives you the opportunity to see how much CPU these processes are consuming with respect to the rest of the system. For example, a high value on this parameter may indicate there is at least one process with too high of a priority. Perhaps the best use of the top command is to allow it to run and refresh every three seconds. If you check the display at intervals over time,...

... been consuming available processing time or waiting on I/O. For example, if %idle is very low, you can determine that the system was kept very busy. Similarly, a high value for %iowait can indicate a problem with the disk. If %system or %nice is much higher than %user, it can indicate an imbalance of system and prioritized processes that are keeping normal processes from running.
ensures a single application is performing at peak efficiency, and security monitoring helps you ensure the systems are protected in the most secure manner. Monitoring a MySQL server is akin to monitoring an application. This is because MySQL, like most database systems, lets you measure a number of variables and status indicators that have little or nothing to do with the operating system. However, a database...

... memory usage over a period of time to ensure you are not observing a normal cleanup operation. When periods of high paging occur, it is likely that low memory availability may be the result of a runaway process consuming too much memory or too many processes requesting too much memory. This kind of high paging, called thrashing, can be treated the same way as a CPU under contention. Processes that consume...

Date posted: 20/10/2013, 17:15
