Sunday, July 31, 2011

Week 10 report

This was a really interesting weekend. I've been digging through my newly discovered MySQL tuning tools most of the time, and discovered 2 great ideas that I could include to my monitor.

First is the mysqltuner 2.0 ruleset that I already mentioned, which I plan to add this week.
And the other is mk-query-profiler from the maatkit command line tools. It uses 'SHOW SESSION STATUS' to read lots of very interesting data about single executed queries. This would be a wonderful addition to my 'Query Analyzer Dialog'.

What I made this week:
I started out realizing that the slow query log loading didn't work at all, so I had to do some bug fixing there. Having the slow log working again I noticed that I forgot to implement grouping/summarization of the slow query columns such as examined_rows, sent_rows, lock_time etc. It took quite some code rearrangement to fix this.

And I kept finding new bugs that I had to fix first. One for example was that the query profiling information in the 'Query Analyzer Dialog' was recorded for the wrong query.

In the remaining time of the week I implemented
- chart config import and export using json as data format
- chart title and series label editing dialog
- a small look into drizzle support. With a small patch most of the status page now works under drizzle (some values don't exist and return php notices though)
- a fix for cpu charts not displaying anything
- error handling when the pma session runs out. It now displays a dialog asking the user to reload the page
- in the 'query analyzer dialog' it now displays all explain outputs (there are multiple explain rows when multiple tables are used)
- added a check for mysql versions that don't support table logging (< 5.1.6)

So lots of small needed fixes and adjustments, plus a unexpected power outage on friday that lasted most of the day kept me from writing actual code for the advisory system.

What I will do next week:
- Advisory system using mysqltuner 2.0 ruleset

What I still would like to do for phpMyAdmin, though not mission critical for my project:
- Display chart from display table page: Handle big amounts of data with zoomable charts
- Debug server variables table width auto-adjustment (doesn't always adjust properly)
- Fix UI of create table dialog / enum editor
- Status Monitor: Add disable general_log + truncate table link to setup dialog
[Edit 01.08.11]
- One more great idea: For the Monitor charts: Allow charting over calculated variables, like the query cache hitrate (Hit rate = Qcache_hits / (Qcache_hits + Com_select))
[Edit 02.08.11] I never run out of ideas -_-
- Fix the query analyzer sql formatter to not fail on subqueries or on occurring strings like "this is where the formatter fails"
- Add a 'Search' - link to the monitor tab allowing one to make searches over slow log / general log and their EXPLAIN's (e.g. find all queries using temporary tables)

Saturday, July 30, 2011

The extinction of the MySQL Tuning tools

On my journey of implementing tuning/optimizing tools for phpMyAdmin I've came across many already existing open source tools in that area.
I'm starting to realize though, to my surprise, that most of these tools haven't been updated or gained new features for a year or more. You rarely get to read recent blog postings about variable tuning either.

The most well known tools are:
  • mytop - Like the *nix top but for mysql threads. Last release is from Feb 2007
  • mysqlreport - '...makes a friendly report of important MySQL status values'. Last release on April 2008 and official statement of no longer being maintained.
  • mysqltuner - makes tuning recommendations. Last release from April 2011 but from what I could see only bug fixes and cosmetic changes since it's first release in 2008
  • mysqltuner 2.0 - a misleadingly named mysqltuner fork by Sheeri K. Cabral. Last release in April 2010. Was only active for 5 months.
So I'm quite baffled that I seem to be the only one who's currently working on a open source MySQL tuning tool, despite the fact that MySQL is getting quite popular nowadays. I really wonder why so little progress is being made in this direction. Though, I'd love to hear that this is not the case and there's is some recent tools I just haven't found.

Either way, while looking at the tools, I saw that mysqltuner 2.0 uses a comprehensive set of rules to build the recommendations. Since it has been build by a database professional, actually quite a celebrity in the db world, from what I could read, I will probably use it as the basis for my suggestions/advisory system. Gladly the GPL-License allows me to do so :)
This is very useful since it will save me a lot of work "inventing" the recommendation rules, for which you need a lot of MySQL experience.

Also I'm the only pma gsoc student that posts project related stuff which is not a weekly report! :(

[Edit 31.07.11] Ok, I just ran into all the more recent tools that I failed to find, the most extensive one being maatkit. Definitely gonna check out that one!

Sunday, July 24, 2011

Week 9 report

This week did quite turn out as I planned to but ended with good results nonetheless.
An unusually big amount of time I had to invest in tracking down a series of bugs.
For the CSS Sprites I built last week, I had to rewrite PMA_linkOrButton() which is a core function of phpMyAdmin, so that took quite a while as well. But in the progress I was able to fix a bug and make the code cleaner.
I still didn't find the time to improve the settings import/export and IE compatibility.

But apart from that I have built a 'Query analyzer' Dialog that is reachable through clicking 'SELECT' - Statements in the Log Table. It allows one to get profiling and 'EXPLAIN' information about that query. I have also built a very simple Javascript MySQL Formatter do display the query nicely in an editable area, so one can also tweak the query and check for execution time changes.

And the last addition was another possibility to filter the log data. If enabled, it removes variable data in WHERE statements with regexp magic, then regroups the queries. So for example if you have a user table that has a name and user id column and your code does hundreds of queries to get the name of different user ids, these queries will be grouped together into one.

And finally I have now a concrete idea on how to build the Advisory/Suggestion system.
- I will add a few advisories inside the query analyzer dialog, using the profiling and explain information.
- On the 'Status/Server variables' Tab I will add a button that allows the user to start a variable analyzer that throws out suggestions for the server variables based on the values of the status variables.

Once that is done, my GSoC project is completed :-)

What I made this week
- 'Query Analyzer' - Dialog. Pretty prints the query, shows profiling information, shows EXPLAIN information
- Filter: Group queries, ignoring the variable data
- Lots of bug fixing
- More work on CSS Sprites, also now documented on the wiki

What I will do next week
- Hopefully get to the import/export part
- Advisory system
- [Edit 26.07] I forgot that I still need to implement the chart editing feature. It should allow one to rename the chart title and series.
- [Edit 28.07] I also forgot that I still need to fix the appearing empty spaces in the chart grid when changing columns count

Sunday, July 17, 2011

Week 8 report

This week I've done a good portion on finalizing the monitor, such as chart ordering and i10n. After that I've invested a considerable amount of time in replacing the big amounts of icon images in phpMyAdmin with CSS Sprites, since they improve page loading speed and allow for better style customization.

Since the Monitor is now not far from finished, I can start focusing on some type of Advisory/Suggestion System that can give the user hints for optimizing their configuration and server queries. I will need to put together a plan on how to infer those advisories and a proper visualization of them. For that I've already done some research of whether I can solve that with a business rule engine - as it would be just wonderful to have all these advisories/suggestions getting inferred from a set of well organized rules. To my disappointment there seems very little available for PHP. But I will keep looking.

What I made this week:

- Disable CPU and Memory graph for database server that are not local(-host)
- i10n
- Wrote a jQuery Plugin for convenient element reordering in tables, since I couldn't find any good solution for that: http://tyron.at/tmp/ts.html
This allows for easy reordering of charts now.
- Replaced chart size settings with amount of columns. The charts automatically size to the dimensions of the viewable area.
- The chart ordering is now saved as well. It will keep its order after a reload/restart of the browser.
- Improved usability of the returned log data table by adding a 'total sum' row at the bottom and adding a way of filtering the query texts. So now one can just type a table name and you get to see how many queries have been executed on that table (not pushed to my repo yet).
- I've spent a lot of time to add CSS Sprites to all of phpMyAdmin to replace the big amounts of single icon files currently being used. In a few tests I noticed a reduction of 20-30% of GET requests on a page load, resulting in the page being loaded faster. But it will also make it easier to customize themes and will make the pma code more readable.

What I will do next week:

- Still need to work on that settings import/export & IE compatibility
- Some feature that allows you get profiling information from the queries in the returned log data
- Check out what the possibilities are for displaying statistics or list over server queries that failed with an error and whether it would be of any use to read from the log files instead of the log table.
- Flesh out a concept for the Advisory/Suggestion system.

Monday, July 11, 2011

Under attack

Sigh, I'm currently trying to reduce the damage caused by a DDoS attack on my server.
A hacker has found a weak spot in one of my oldest websites (the PHP code I wrote for it is like 5-8 years old). I've taken the site offline now, nonetheless the flood of requests still seems to slow down the server. But at least it responds again.

The hacker has abused the fact that the comments list on that site has no limit in how much shall be displayed, so the hacker completely overloads the MySQL Server by flooding requests to load all 51000 Comments. Most of you probably gonna slap your forehead when you read this, I do too ;) But this website was made when I started with PHP and serves only a really small gaming community. I have left that community years ago, but left the site online. And the code remained mostly untouched since then. I will need to take it down permanently some day, and let that collection of really bad PHP code rest in peace.

By the way the first indication of a DDoS Attack I found through my newly programmed Monitor Tab in phpMyAdmin which showed me that some queries have been executed several thousand times. So thats an epic win right there :D

Looking at the logs, the attack is truly distributed. I checked some of the IPs with reverse DNS -the results: Saudi Arabia, Brazil, one ip from Amazon Web Services, China, Romania, ....
Might generate some statistics over all these ips once the attack is over, that would be awesome to have.

[Edit:]
Mini-Statistics: (counting from like 1 hour after the attack has started)
Amount of DDoS Attacks: 202.299
Amount of IPs from where the attacks come from: 1.106

Sunday, July 10, 2011

GSoC Project LoC

Since I love making statistics, I checked very roughly how much lines of code I've added to the phpMyAdmin codebase so far. This does not include a bunch of smaller patches and small code changes, and of course a lot of it is newlines and comments:

functions.js : ~100 LoC
server_status.php: ~1000 LoC
server_status.js: 1380 LoC
server_variables.js: 160 LoC
tbl_chart.php: 140 LoC
tbl_chart.js: 240 LoC

Total: 3020 LoC

Week 7 report

I've been making good progress on the 'Monitor' Tab. Now the user has the ability to select a time span in any chart, which will load intelligently grouped data from the logs.
Also the current configuration is now saved in the browsers local storage (however, currently it doesn't save how the charts are ordered).
I'm quite happy how the monitor turned out so far, though there's still a lot of rough edges that I will have to even out ;-)

Regarding intelligently grouped data:
Currently, selecting log data does the following (more will come):
- Returns only INSERT, UPDATE, DELETE, SELECT
- Groups together queries that are identical and therefore adds a 'count' column to the result table.
- Truncates long INSERTs and UPDATEs (above 300 chars)
- If selected, groups together INSERTs into the same table.
- Also creates some statistics what query types have been found in this time range

What I made this week:
- Linux CPU / Swap Usage are now correct
- Lots of UI work on for selecting data from the logs. The user can now select a time span in any chart that will load the logs from the general_log or slow_log in that given time span.
- Play/Pause button
- An Instructions/Setup dialog that contains useful information on how to use the Monitor, as well as a bunch of links that allow the user to configure the server to start logging.
- A first implementation of saving the settings to the browsers storage (localStorage)
- A lot of other small UI improvements (e.g. edit mode, better chart tooltip)

What I will do next week:
I will have a lot of small things to do for the Monitor:
- Take into consideration that not every server is on localhost
- Test Monitor without super user rights
- Proper l10n
- A better "chart grid". I think one should be able to select how many 'columns' of charts he wants to have. Maybe some autosize to window width would be useful too.
- More ways to save the monitor settings
- IE compatibility
- Import/Export config
- Save how the charts are ordered

- More ways to extract the data you need for optimizing server/queries
- Generally improve the Monitor to make it more usable for query optimization. Need to research on this a bit more.

If some time is left, I will improve some other parts of pma on which I have wrote a lot of suggestions on the mailing list (i.e. create table dialog).

Saturday, July 2, 2011

Week 6 report

This week I made some more progress on the new 'Monitor' tab. But I was also looking up on other query/server optimizer tools, to see how they solved it.
There's is 2 very good presentations from Oracle and the Pythian Group that I recommend watching to anyone who is interested in this topic.

What I made this week:
- A bit of more research how other query optimizer tools are built
- In the 'Monitor' Tab the charts can now be removed again and the UI is a bit better
- For all the charts, the chart size and refresh rate is changeable (this included building a tiny popup box)
- I've implemented cpu, memory and swap usage for windows and linux, however cpu and swap for linux still seem to display incorrect values. Some calcuation must be wrong or a name mistyped, I couldn't figure it out yet.

What I'm going to do next week:
- Fix cpu and swap chart for linux
- Below the charts will be starting to build a UI for selecting data from the slow query log and general log
- Some sort of play/pause button
- When there's time left: Safe the charts to the browser localstorage (next version will add saving to db)