Software Developer and Performance Engineer
Posts tagged data
How Many Performance Metrics Are Enough?
Sep 3rd
I have been asked this question many times and for me it boils down to one idea.
You can never have too much of a good thing!
Running tests usually takes a lot of time and effort. There is planning, setup, executing the test, capturing data, and then processing that data. Having to rerun a test just to get some data that wasn’t captured before is a lot of work. That is why capturing everything up front can save a lot of time and aggravation.
A word of caution though. Watch the disk space while capturing. Make sure the drive won’t be filled with performance data. That could cause the test to fail and cause you to start all over again. Worse, parts of the system could become corrupted and require a complete reinstall and setup of the test.
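One way to watch the disk during a capture is a small free-space check that runs alongside the collectors. This is only a sketch: the function names, the capture path, and the 5 GB threshold are placeholders I chose for illustration, not values from any particular tool.

```python
import shutil


def has_enough_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has at least min_free_bytes free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.free >= min_free_bytes


def check_capture_volume(path=".", min_free_gb=5.0):
    """Raise before the capture drive fills up. The 5 GB default is an assumption."""
    min_free_bytes = int(min_free_gb * 1024 ** 3)
    if not has_enough_space(path, min_free_bytes):
        raise RuntimeError(
            f"Less than {min_free_gb} GB free on the volume holding {path!r}; "
            "stop or trim the capture before the drive fills."
        )
```

Calling `check_capture_volume` before the test starts and then on a timer during the run gives you a chance to stop collecting before the drive fills and takes the test down with it.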
So what do I monitor? Everything!
Whether I am using typeperf, nmon, JMX, database metric snapshots, or anything else, I try to capture much more than I think I need, though I stop short of capturing every piece of data possible. With typeperf, you can literally collect over 2000 different metrics. A database's metric collection can produce hundreds of data points: metrics about table spaces, buffer pools, resources, even individual queries.
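As an illustration of capturing a broad but bounded set of counters, here is a sketch that builds a typeperf command line from a counter list file. The helper name, file names, interval, and sample count are all assumptions for the example; the flags used (`-cf`, `-si`, `-sc`, `-f`, `-o`, `-y`) are standard typeperf options on Windows.

```python
def build_typeperf_command(counter_file, output_csv, interval_s=5, samples=720):
    """Build a typeperf argument list (Windows only).

    counter_file: text file with one counter path per line, for example
                  \\Processor(_Total)\\% Processor Time
    output_csv:   where typeperf writes the samples
    interval_s:   seconds between samples
    samples:      number of samples to collect before typeperf exits
    """
    return [
        "typeperf",
        "-cf", counter_file,     # read the counter list from a file
        "-si", str(interval_s),  # sample interval in seconds
        "-sc", str(samples),     # stop after this many samples
        "-f", "CSV",             # output format
        "-o", output_csv,        # output file
        "-y",                    # answer yes to prompts such as overwrite
    ]


# Hypothetical usage on a Windows host:
#   import subprocess
#   subprocess.run(build_typeperf_command("counters.txt", "perf.csv"), check=True)
```

Keeping the counter list in a file makes it easy to start from a wide list and trim it when the output grows too large. With the defaults above, the capture runs for about an hour (720 samples at 5 seconds each).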
Later I’ll discuss how I process this huge amount of data.
Managing Testing Data
May 15th
Performance testing has an almost insatiable thirst for data. When thousands of tests are executed to put load on servers, each test needs data to send to the server. In my experience, three types of data are needed.
- Unchecked – Values that must be filled out but for the most part are not checked by the server. These values can be randomly generated, selected from a list, or simply entered as fixed data that doesn’t change from test to test. Managing this type of data is very simple and requires relatively little work.
- Checked – Values that are checked by the server and must match for a test to execute properly. These values differ by test type, but may be the same every time the test is run. A great way to deal with these values is to look them up by test type or name, together with a description of the value. Properties files, spreadsheets, or some other simple mechanism can easily accommodate this type of data. The data can still be random, but it must be grouped so that a random selection picks all the pieces of data that go together.
- Consumable – Values that can be used only once and are lost after they are used. These values are the hardest to manage because they not only behave like checked data, they also disappear, so new data is needed to execute a second test. When trying to put a server under heavy load, lots of unique data will be needed. The best way to manage this kind of data is with a database. The sheer volume of data, and the fact that each value needs to be marked as used, are best handled by a tool made for the purpose.
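To make the consumable case concrete, here is a minimal sketch of a pool of one-time values kept in a database, using Python's built-in sqlite3 module. The table name, column names, and function names are my own for illustration. The sketch also assumes a single connection; several load generators sharing one pool would need stronger locking than shown here.

```python
import sqlite3


def create_pool(conn, values):
    """Create the (hypothetical) consumable table and load unused values into it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS consumable ("
        " id INTEGER PRIMARY KEY,"
        " value TEXT NOT NULL UNIQUE,"
        " used INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO consumable (value) VALUES (?)",
        [(v,) for v in values],
    )
    conn.commit()


def claim_value(conn):
    """Return the next unused value and mark it used, or None if the pool is empty."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT id, value FROM consumable WHERE used = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE consumable SET used = 1 WHERE id = ?", (row[0],))
        return row[1]
```

Each test calls `claim_value` once before it runs. A return value of `None` means the pool is exhausted and more data has to be generated before the load test can continue.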