Get a summary footprint of a MySQL server instance

Landing at an enterprise with ongoing projects means that servers are often handed to IT staff without complete knowledge of what's inside.
I've built a simple script, scraping from here and there, to gather a summary of relevant information.
Once you've gained remote access to the MySQL instance, you can execute the queries to identify the following information regarding the target database server:
  • The host name, what operating system it runs on, the MySQL version installed, default collation of the instance, the installation directory and the data directory;
  • How many user databases it hosts, what they're called and the collations they use;
  • The size of each database as a whole and broken down by storage engine;
  • All the tables and the space they use;
  • The creation and last modification dates of each database.
The script is commented and follows the same order as the above topics:

Running this on my test server, I find the following result for the host, operating system and directory locations:
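A minimal sketch of a query that could gather this information, assuming only standard MySQL system variables. The column aliases are illustrative and this is not necessarily the statement used in the script:

```sql
-- Host, operating system, version, default collation and directories.
-- All values come from built-in server system variables.
SELECT @@hostname           AS host_name,
       @@version_compile_os AS operating_system,
       @@version            AS mysql_version,
       @@collation_server   AS default_collation,
       @@basedir            AS installation_directory,
       @@datadir            AS data_directory;
```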

The following result tells me how many databases there are on the instance:
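As a sketch, the count could come from information_schema, assuming that "user databases" means everything except the system schemas listed in the filter (that exclusion list is my assumption, not the script's):

```sql
-- Count the schemas that are not MySQL system schemas.
-- 'sys' only exists on newer versions; listing it is harmless elsewhere.
SELECT COUNT(*) AS user_databases
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME NOT IN ('information_schema', 'mysql',
                          'performance_schema', 'sys');
```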


If relevant, I can choose to see the database names and respective collations:
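A query along these lines would list them, again assuming the same set of system schemas is excluded (the aliases are illustrative):

```sql
-- Name, default character set and default collation of each user database.
SELECT SCHEMA_NAME                AS database_name,
       DEFAULT_CHARACTER_SET_NAME AS default_charset,
       DEFAULT_COLLATION_NAME     AS default_collation
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME NOT IN ('information_schema', 'mysql',
                          'performance_schema', 'sys')
ORDER BY SCHEMA_NAME;
```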

Then, the script gets the size of each database, in MB:
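One way to sketch this, assuming the size of a database is taken as the sum of data and index lengths reported by information_schema.TABLES (for InnoDB these figures are estimates, and a database without tables will not appear in the result):

```sql
-- Total size per user database, in megabytes (data + indexes).
SELECT TABLE_SCHEMA AS database_name,
       ROUND(SUM(DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS size_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA NOT IN ('information_schema', 'mysql',
                           'performance_schema', 'sys')
GROUP BY TABLE_SCHEMA
ORDER BY size_mb DESC;
```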

It breaks that size down by storage engine:
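A possible sketch of that breakdown, assuming views are left out because they have no storage engine (the aliases are illustrative):

```sql
-- Size per database and storage engine, in megabytes.
-- Only base tables are counted; views report a NULL engine.
SELECT TABLE_SCHEMA AS database_name,
       ENGINE       AS storage_engine,
       COUNT(*)     AS table_count,
       ROUND(SUM(DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS size_mb
FROM information_schema.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
  AND TABLE_SCHEMA NOT IN ('information_schema', 'mysql',
                           'performance_schema', 'sys')
GROUP BY TABLE_SCHEMA, ENGINE
ORDER BY TABLE_SCHEMA, size_mb DESC;
```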

Still on size reporting, we get each table's size, split into records (data) and indexes:
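As a sketch, the per-table figures could be read from the same information_schema.TABLES view. The choice of megabytes and the ordering are my assumptions:

```sql
-- Data and index size of every user table, in megabytes.
SELECT TABLE_SCHEMA AS database_name,
       TABLE_NAME   AS table_name,
       ROUND(DATA_LENGTH  / 1024 / 1024, 2) AS data_mb,
       ROUND(INDEX_LENGTH / 1024 / 1024, 2) AS index_mb
FROM information_schema.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
  AND TABLE_SCHEMA NOT IN ('information_schema', 'mysql',
                           'performance_schema', 'sys')
ORDER BY TABLE_SCHEMA, (DATA_LENGTH + INDEX_LENGTH) DESC;
```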

Finally, it checks the creation and update dates of the existing databases to help determine whether it's important to have a recent backup of them:
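MySQL does not record a creation date for the database itself in information_schema, so a sketch like the one below approximates it with the oldest table creation time and the newest table update time. This approximation is my assumption about how such a check could work, and UPDATE_TIME can be NULL depending on the storage engine and version:

```sql
-- Approximate creation and last-update dates per user database,
-- derived from the tables each database contains.
SELECT TABLE_SCHEMA     AS database_name,
       MIN(CREATE_TIME) AS created,
       MAX(UPDATE_TIME) AS last_updated
FROM information_schema.TABLES
WHERE TABLE_SCHEMA NOT IN ('information_schema', 'mysql',
                           'performance_schema', 'sys')
GROUP BY TABLE_SCHEMA
ORDER BY last_updated DESC;
```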

This script only gives a summary of the overall instance, but it has enough detail to determine the software versions, the volume of information and the usage of the databases.

Photo credit: tableatny@Flickr


Configuring and testing MySQL binary log

The binary log contains “events” that describe database changes. On a basic installation with default options, it's not turned on. This log is essential for the following requirements:
  • Replication: the binary log on a master replication server provides a record of the data changes to be sent to slave servers.
  • Point-in-time recovery: it allows a database to be recovered by restoring a full backup and then replaying the subsequent events saved in the binary log, up to a given instant.
To turn on the binary log on a MySQL instance, edit the 'my.cnf' configuration file and add the following lines to the [mysqld] section:

#Enabling the binary log
log-bin=binlog
max_binlog_size=500M
expire_logs_days=7
server_id=1

With these lines, we're configuring the binary log as follows:
  • The binary log is turned on, and every file will be named 'binlog' with a sequential number as the extension;
  • The maximum size of each log file will be 500 megabytes;
  • Binary log files expire after 7 days and can then be removed automatically;
  • Our server has 1 as its identification number (this serves replication purposes, but it must be set whenever the binary log is enabled).
Afterwards, restart the service. On my Ubuntu lab server, the command is:
shell> sudo service mysql restart
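After the restart, the settings can be confirmed from the mysql command line. A minimal sketch, assuming a MySQL version where 'expire_logs_days' is still available (newer releases replace it with 'binlog_expire_logs_seconds'):

```sql
-- log_bin should report ON, and the other values should match my.cnf.
SHOW VARIABLES
WHERE Variable_name IN ('log_bin', 'log_bin_basename',
                        'max_binlog_size', 'expire_logs_days', 'server_id');
```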
Now, opening a mysql command line, we can check what binary log files exist:
mysql> show binary logs;
+---------------+-----------+
| Log_name      | File_size |
+---------------+-----------+
| binlog.000001 |       177 |
| binlog.000002 |       315 |
+---------------+-----------+
2 rows in set (0.00 sec)
Let's test whether the binary log is working properly using the sample Sakila database. The sample comes with two files:
  • sakila-schema.sql: creates the schema and structure for the sakila database;
  • sakila-data.sql: loads the data into the tables on the sakila database.
Let's first run the database creation script:
shell> mysql -u root -p < sakila-schema.sql
If we manually flush the log file, the database instance will close the current file and open a new one. This is relevant because we want all the DML operations on the Sakila database to be recorded in a new file:
mysql> flush binary logs;
Query OK, 0 rows affected (0.01 sec)

mysql> show binary logs;
+---------------+-----------+
| Log_name      | File_size |
+---------------+-----------+
| binlog.000001 |       177 |
| binlog.000002 |       359 |
| binlog.000003 |       154 |
+---------------+-----------+
3 rows in set (0.00 sec)
Now, we can run the data script for the sakila database to insert the records:
shell> mysql -u root -p < sakila-data.sql
When it's done, we can check the binary logs to observe the file size increment:
mysql> show binary logs;
+---------------+-----------+
| Log_name      | File_size |
+---------------+-----------+
| binlog.000001 |       177 |
| binlog.000002 |       359 |
| binlog.000003 |   1359101 |
+---------------+-----------+
3 rows in set (0.01 sec)
Next, we can validate the content using the mysqlbinlog utility. By default, mysqlbinlog displays row events encoded as base-64 strings using BINLOG statements. To display actual statements in pseudo-SQL, the --verbose option should be used:
shell> sudo mysqlbinlog --verbose /var/lib/mysql/binlog.000003
Using the verbose option, commented output like the following is produced from our log, one block per row change:
### INSERT INTO `sakila`.`country`
### SET
###   @1=1
###   @2='Afghanistan'
###   @3=1139978640
So, all the changes made since 'binlog.000003' was opened are recorded there and can be replayed to produce the same set of changes on a given target.
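The log can also be inspected without leaving the mysql command line. The sketch below assumes the file name 'binlog.000003' from the listing above. It first checks the logging format, because row events like the one shown are written when the format is ROW (or when MIXED falls back to row logging), and then lists the first events in the file. Note that SHOW BINLOG EVENTS reports event types and positions, not the decoded row values that mysqlbinlog --verbose prints:

```sql
-- Which logging format is the server using? (ROW, STATEMENT or MIXED)
SELECT @@global.binlog_format AS binlog_format;

-- List the first ten events recorded in the file created by the flush.
SHOW BINLOG EVENTS IN 'binlog.000003' LIMIT 10;
```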




This is why you're not better off with a commercial database

When tackling a new enterprise project to support a given business, you face the challenge of choosing and committing to a database platform. The choice should be the most adequate one, given the needs and requirements of the new information system and the data to be hosted and managed.
Typically, a number of factors should be taken into consideration, like security features, storage requirements, reliability, high availability, backups, disaster recovery, data compression, technical support and, last but definitely not least, the cost of the solution. Added to that, there is also performance, scalability and ease of administration to think about.
With the result of this analysis, most of the time, the verdict is this: data platforms available as community editions or free open source fall short of the given requirements. So, the advice is almost always to acquire commercial licenses or expand the licensing already owned.
And this should give you peace of mind for a while. At least until the first release of the system goes live. After that, some of the common pitfalls are:

  • Security permissions were not exhaustively identified for all database objects. To solve things quickly, you turn your database authorization management into Swiss cheese;
  • The new system has issues and, until the bugs are fixed, manual correction scripts have to be executed during working hours, maiming overall business activity;
  • As the data volume grows, there is performance degradation due to inefficient indexing, bad user experience design or poor database coding skills;
  • Technical support provided by the database vendor performs an audit on the workload, does some tuning on the server instance, and shifts responsibility for the remaining performance problems over to the development team;
  • The development team struggles to adopt the database vendor's recommendations, as they have a great impact on the source code;
  • Management wants high availability, but it won't commit the infrastructure resources and budget to set it up properly;
  • You do not have a remote site on which a disaster recovery plan can be based, nor a lab where you regularly restore backups and perform automated integrity checks;
  • You are understaffed, with no one possessing deep skills in the specific data platform you own.

Even if just a third of these pitfalls sound familiar, what are you doing with your next project? Still thinking of recommending commercial software because people are the ones to blame here?
In a global organization, after you deploy the first release and spread it across the offices, the licensing and support costs will skyrocket. That money could be spent preventing some of the pitfalls mentioned here. If you cut back on licensing and support, you can spend on infrastructure and staff.
There are wonderful commercial databases out there, but during the business requirements phase, the choice should be made by looking at the solution as a whole and not at vendor promises, because the final solution will be the result of development and available budget, not a sales brochure.
Vision and engineering are the keys to success. And I'm afraid that doesn't come out of the box.
