Archive

SPE BoK

 The SPE Body of Knowledge

 

How can we benefit from using a Body of Knowledge approach for Software Performance Engineering? The goals are to address the career path, the organization, and the industry practices, and to enable you to build a BoK for the performance engineer within your company. There are five knowledge areas for performance engineering.

This is an overview presentation I gave at the Greater Boston Computer Measurement Group and at the National meeting. There are two documents here, the PowerPoint and the detailed document. It is a work in progress.

Presentation: SPEBoK CMG National V4

Paper with the details of the SPEBoK: The Guide to the Software Performance engineering body of knowledge V4

Please send me your comments.

 

Walter

 

 

Performance testing

Hulk Smash puny web site.

The Healthcare.gov web site was crushed, or smashed, as soon as it was opened. The top IT people are quoted as saying it was due to the volume and the unprecedented number of users accessing the site. They had more than five times the estimated number of people: they predicted 50,000 to 60,000 concurrent users, and there were actually 250,000 to 300,000. The total number of people who could use the web site is estimated to be 23 million; if one percent are on the system, that is 230,000 people at one time. The 50,000 to 60,000 user estimate works out to about one quarter of one percent (0.25%). That seems a little low. Is it not more reasonable that 1% or more would access a system that was so widely publicized?
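The arithmetic behind those percentages is simple enough to script. This is a minimal sketch using only the figures quoted above; the function names are mine and purely illustrative.

```python
# Back-of-the-envelope concurrency estimate using the figures quoted above.
ELIGIBLE_POPULATION = 23_000_000  # estimated number of people who could use the site


def concurrent_users(population: int, fraction_online: float) -> int:
    """Number of simultaneous users if the given fraction is online at once."""
    return round(population * fraction_online)


def implied_fraction(population: int, users: int) -> float:
    """Fraction of the population that a concurrent-user count represents."""
    return users / population


if __name__ == "__main__":
    for pct in (0.25, 1.0, 2.0, 3.0):
        users = concurrent_users(ELIGIBLE_POPULATION, pct / 100)
        print(f"{pct:5.2f}% online -> {users:,} concurrent users")
    # What fraction did the reported peak represent?
    for observed in (250_000, 300_000):
        share = implied_fraction(ELIGIBLE_POPULATION, observed) * 100
        print(f"{observed:,} users is {share:.2f}% of the population")
```

Running it shows that the reported 250,000 to 300,000 users is only a little over 1% of the eligible population.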

Demand forecasting and scenarios

It is always a challenge to define the workload profile for a new web site. There are user behavior questions such as: When are they most likely to access the web site? What is the hourly traffic pattern distribution? How many simultaneous users will there be? What would the peak be, and what is normal? What are the types of user profiles or user categories? What is the possible number of users (23 million)? What is the average number of users (1%, 230,000)? What if you got 2X or 3X the average?

What will the workload profile be like as the deadline to register gets closer? Will the pattern be like the Batman mask: a spike in the beginning, then it flattens out, and then another spike at the end of enrollment?
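One way to make the Batman mask concrete is to write the scenario down as a daily demand curve. The sketch below is only an illustration: the linear ramp, the function name, and every number in the usage example are assumptions, not measurements from any real site.

```python
def batman_profile(days: int, baseline: float, peak: float, spike_days: int) -> list[float]:
    """Expected concurrent users per day: a spike at each end, flat in the middle.

    Assumption: demand falls linearly from `peak` at the first and last day
    to `baseline` once you are `spike_days` away from either edge.
    """
    if days <= 0 or spike_days <= 0:
        raise ValueError("days and spike_days must be positive")
    profile = []
    for day in range(days):
        distance_from_edge = min(day, days - 1 - day)
        if distance_from_edge < spike_days:
            weight = 1 - distance_from_edge / spike_days
            profile.append(baseline + (peak - baseline) * weight)
        else:
            profile.append(baseline)
    return profile


if __name__ == "__main__":
    # Hypothetical 10-day enrollment window, for illustration only.
    for day, users in enumerate(batman_profile(10, 100_000, 300_000, 2), start=1):
        print(f"day {day:2d}: {users:,.0f} users")
```

A curve like this gives the test team a shape to argue about, which is usually more productive than arguing about a single peak number.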

The next topic to understand is the transaction mix: what are the users doing on the web site? For a new web site, registration is a one-time transaction for each user; they will not register again. But for a new site, it is the most critical transaction. So the question becomes: how many simultaneous registrations do you expect?

The Healthcare.gov web site starts off as a router. Of the 23 million people who will visit it, some percentage will be routed to a state healthcare web site, so maybe 12 million will eventually be the user population.

Design goals

Answering these questions sets up the design goals and the performance testing goals for your web site. Most web sites must interact with many third parties, and the Healthcare.gov site is no different. You can communicate your volumetrics to them so they can plan as well.

The design goals are translated into the development phase. The developers must know what the critical business transactions are, along with their response time goals and their throughput goals (for example, 10 TPS).

Performance testing

Preparing to test a system that might have 10 million, 15 million or more customers is no easy task. The data setup alone is a huge task. The team will need to create the synthetic data to support the performance test. The team will then have to simulate 50,000 or 100,000 users accessing the system. This is where cloud-based performance testing tools can really help.
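When sizing the simulated user population, it helps to know what throughput a given number of virtual users will actually generate. A minimal sketch, using the interactive response time law from queueing theory (throughput = users / (response time + think time)); the response time and think time values in the example are assumptions chosen for illustration.

```python
def throughput_tps(virtual_users: int, response_time_s: float, think_time_s: float) -> float:
    """Transactions per second generated by a closed population of users."""
    cycle_time = response_time_s + think_time_s
    if cycle_time <= 0:
        raise ValueError("response time plus think time must be positive")
    return virtual_users / cycle_time


def users_needed(target_tps: float, response_time_s: float, think_time_s: float) -> float:
    """Virtual users required to drive a target throughput."""
    return target_tps * (response_time_s + think_time_s)


if __name__ == "__main__":
    # Assumed: 2 s response time and 48 s of think time between requests.
    print(throughput_tps(50_000, 2.0, 48.0), "TPS from 50,000 virtual users")
    print(users_needed(10.0, 2.0, 48.0), "virtual users needed for 10 TPS")
```

The think time dominates the result, so it is worth agreeing on that assumption with the business before the load generators are provisioned.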

Production – APM

For any new web site, with a new user profile, you have to be watching the system utilization and the transactions. Application performance management is a critical part of being prepared for such a large system. The tools can provide real-time transaction performance information, measuring the user experience. You can spot slow transactions before they become a real issue. The system resource consumption can be closely monitored as the workload increases, again allowing you to respond quickly with more resources.

Leverage the Software performance engineering Body of knowledge

What can you do to prepare for a very large web site?  You can take advantage of the Software Performance engineering Body of knowledge and apply each knowledge area to prepare.

  • Software architecture and design
  • Performance testing and validation
  • Capacity planning
  • Application performance management
  • Problem detection and resolution

These five areas work hand-in-hand to prepare your web site.

For large, complex systems, a performance data engineer is a must because, historically, the database and the SQL statements have caused the most trouble when implemented by application architects with limited database understanding. Experience tells us that when the application team refers to the database as persistent storage, there is much opportunity for performance improvement.

This role requires a deep understanding of the physical implementation of the database system. In this role you must understand how the application will use the tables and the access paths for those tables. This insight is needed to define the indexing strategy. You must be able to create a physical implementation model, where you spread the database tables and indexes across the disk subsystem. This role must fully understand the normalization process and when you must de-normalize for performance. The goal is to minimize joins for very large tables.

In this role, you also provide guidance on writing SQL statements and on writing database packages and procedures, and you provide guidelines on the use of the “hint” statement (don’t do it). Understanding the impact of views is very important. With database views there is a balance to strive for between too many views and too few views. In most large organizations there is a mandatory one view per table, then additional views based on application requirements. Views can hide complexity; however, if they are too complex, they can be tough for new team members to maintain.

This role requires the person to be very familiar with the particular optimizer for the database. You must be able to read SQL plans, know how to access the relevant performance tables and views, and understand wait events. What are the different types of wait events? What are the top SQL statements?

Market Data

Calculating system momentum – A market basket of transactions and an index

Can we use momentum as a derived value or index to alert us to impending problems with the application or system? Well, the transaction response time is really a byproduct of the workload on the system resources. So maybe a better way to look at it is: does the workload have momentum? Is the workload increasing or decreasing? Borrowing from physics, momentum is equal to mass times velocity.

We could use transaction complexity to represent mass; we all know that some transactions are heavier than others. However, using response time as velocity really does not work. Instead, I could use the transaction arrival rate to represent velocity. Then I could say that the transaction or system momentum is increasing as the arrival rate increases, taking into account the weight of the transactions.

What I am looking for is a communication vehicle to let non-technical people know how the health of the system is.
Momentum is equal to the transaction weight times the arrival rate of the transactions.

I need to pick a rating or scale for my transactions: 1, 5, 10. Then there is an overall transaction arrival rate and an individual transaction arrival rate. I need the individual transaction arrival rates in order for the momentum index to have a chance of being relevant.

M = (T1 * T1 TPS) + (T2 * T2 TPS) + (….) or index?
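The weighted sum above can be sketched in a few lines. The transaction names, weights, and arrival rates below are hypothetical; only the 1/5/10 scale and the weight-times-TPS formula come from the text.

```python
def momentum(weights: dict[str, int], arrival_tps: dict[str, float]) -> float:
    """System momentum: sum of (transaction weight * transaction arrival rate)."""
    return sum(weights[name] * tps for name, tps in arrival_tps.items())


if __name__ == "__main__":
    # Hypothetical market basket of transactions, rated on the 1, 5, 10 scale.
    weights = {"login": 1, "search": 5, "checkout": 10}
    # Hypothetical arrival rates in transactions per second.
    arrival_tps = {"login": 20.0, "search": 4.0, "checkout": 1.0}
    print("momentum index:", momentum(weights, arrival_tps))
```

Because the weights are custom to each application, the absolute value of the index means little on its own; the interest is in how it moves over time.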

This would be a very custom index for each application. It represents a market basket of transactions, much like an ETF represents a basket of stocks.
Also, what I want to determine is how quickly the momentum is changing, up or down. If I can get the real-time transaction arrival rates, then I can use the momentum to get an early warning of trouble in the system. Another term might be a volatility index for the application. Can I get the alert early, in the front end of the application, and then correlate it with all the system resource monitors?

For this I need to borrow from the financial markets' high-frequency traders (HFT). They have tools and techniques that track large amounts of market data in real time and try to jump in front of the market momentum. I need to jump in front of my system momentum.
The faster I can determine that the arrival rate of the heavy transactions is increasing, the better my chance to jump in front of it and prevent an application or system outage. I need to calculate the rate of change of the arrival rates in real time. I need to see that at a clock tick at time zero, the arrival rate is 10 TPS and the transaction response time is 300 ms. Then I need another sample at the next clock tick to calculate that the TPS is now 11 and the response time is 305 ms. Perfect for using HFT techniques.
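The two-sample comparison described above can be sketched as follows. The sample values are the ones from the text; the one-second tick interval and the function names are assumptions for illustration.

```python
def rate_of_change(previous: float, current: float, interval_s: float) -> float:
    """Change per second between two consecutive samples."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return (current - previous) / interval_s


def percent_change(previous: float, current: float) -> float:
    """Relative change between two consecutive samples, in percent."""
    if previous == 0:
        raise ValueError("previous sample must be non-zero")
    return (current - previous) / previous * 100


if __name__ == "__main__":
    tick_s = 1.0  # assumed sampling interval between clock ticks
    print("arrival rate change per second:", rate_of_change(10.0, 11.0, tick_s))
    print(f"arrival rate: {percent_change(10.0, 11.0):+.1f}%")
    print(f"response time: {percent_change(300.0, 305.0):+.1f}%")
```

In this example the arrival rate grew by 10% while the response time grew by under 2%, which is the kind of gap an early-warning index would want to watch.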

Model test case execution

So, what is the difference between a Software Benchmark and a software performance test?
The key difference is the size of the bet the business has placed on the outcome of the project.

There are hardware benchmarks, database benchmarks, and of course the Transaction Processing Performance Council (TPC), with a long list of standard benchmark tests for companies. The Software Benchmark I am referring to in this article is the custom application benchmark, designed for your particular business or industry. The best way for a business to make a critical bet-the-company decision is to define, design and execute a Benchmark for its unique workload and transaction volumes.

Who uses a benchmark

The Software Benchmark is needed to determine if the business will use the new application or technology for business advantage. The workload must be well defined, the database must be at production sizes, and the system resource consumption must be clearly monitored. When the benchmark is executed and the question is answered, all the details, the facts, the database demographics, and the workload must be clearly understood by the business decision makers. The Benchmark team must get the test right, in the allocated timeline. If the benchmark takes too long, then you have your answer. The results of the Benchmark typically undergo tremendous scrutiny. A third party is often required to provide the needed visibility and help pass the scrutiny.
There are two categories of companies that undertake a benchmark:
1) Software vendors – the companies that make the software application
2) Consumers of the software – the companies that will use the application for business value

The software performance test is usually for a project or program already underway; the application and technology are already decided. Performance testing is used to make sure the Releases will meet the Service Level Expectations. Each test may focus on specific parts of the workload and skip others for a given Release cycle. The performance test plan may include:
1) Component testing
2) Duration testing
3) Stability testing
4) Failover and failback testing
5) Also, a round of tuning may be added

Benchmarks for software vendors

The business has decided to move the product into the large client segment of the market. As such, they need to demonstrate that their application will be able to scale to what their market considers to be large, 10 million accounts for instance. The application must perform well at this level. The key business transactions must still respond in less than two seconds. The overnight processing must still complete in the defined window, say 4 hours for the billing process.

The bet: Business and revenue growth in the large segment of the market. Repositioning of the company in the marketplace in relation to competitors.

The consumers

There are a couple of scenarios for this category. One is that the business is already considered large in the marketplace; they already have 10 million accounts or more. However, the systems in place are older, with restricted functionality that is not easily changed. The business needs to add new features to stay ahead of the competition. The second case is a growth plan, where the business believes that to increase its market share, it must grow significantly. The business may currently serve 1 million accounts, but now has a three-year business plan calling for growth to 50 million accounts. They need a technology platform that can scale with them.
The bet for an already large business: Maintain the current business and add new features quickly on a new platform. Stay a market leader; if the new platform fails, you are no longer the market leader.

The bet for a growth business: Easily acquire new accounts and gain market share, or stumble.

Implementation approach

The software benchmark is an event for the business. It is highly visible to Sr. Management, and if you are the software vendor, it might be highly visible to your sales pipeline. There may be significant deals waiting on the outcome.
There are typically three to four phases required. Even before that, the organization must review the resources required: people, time, equipment, and budget. The focus on a benchmark can distract already busy people. The developers of the code may not have the time or the skills to design and execute a formal Benchmark, and the same goes for the QA team. A Benchmark may require the use of an external testing lab in order to get the proper configuration. The Benchmark project must be treated as a distinct project.

Critical areas

Business goals and objectives

Clearly state the purpose: to demonstrate that the system can safely support the workload of 10 million accounts, and to demonstrate predictable scalability of the application as the workload increases.

Workload profile

In order to gain value from the benchmark, you must have an accurate workload. There will be several user profiles for the online component, with the detailed steps they take through the application. For instance, there are the casual users, the new users, and the power users. There must be a representative batch component as well, as the online purchase transactions will drive the invoice creation process in the batch schedule. There could be a month-end process and a quarter-end process.

Database size and demographics

The data distribution must be well understood and clearly defined. If you have 10 Million accounts, some are active and some are closed. There may be a residential profile and business profile, with different numbers of details under each. For instance, an insurance policy can have one driver or four drivers, plus the cars. For a web shopping cart application, there can be four years of historical purchases. Your database needs to simulate this.

Performance testing process

The performance testing process must be completely visible and flawless. Generally, you do not have time to rerun a Benchmark. The testing scenarios, test execution, metrics and monitoring must be complete. Many tests may be executed in order to get you ready for the official set of benchmark tests. You need to have a very good test results archive system, because you may not be able to completely evaluate the test results until the end.

Results analysis and executive summary

How do you know you ran a successful test? All the detailed results must be compiled and summarized into an executive view. The virtual web transactions and the batch processing must have detailed results. The virtual users will record the response time of each request. Then you must provide the response time distribution: the 50th, 75th, and 95th percentiles. You must also include the transactions-per-second load. For example: under 10 TPS, the 95th percentile was 2.5 seconds.
The batch processes must include the rate of processing for the entire process. For example: there were 5 million invoices processed in three hours. Also, the critical path programs must be clearly identified.
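Both summaries are easy to compute from the raw results. This is a minimal sketch, assuming the response times are available as a plain list; it uses the nearest-rank percentile definition, which is one of several common definitions, and the function names are illustrative.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def response_time_summary(samples: list[float]) -> dict[int, float]:
    """The 50th, 75th and 95th percentile response times."""
    return {pct: percentile(samples, pct) for pct in (50, 75, 95)}


def batch_rate_per_second(items_processed: int, elapsed_hours: float) -> float:
    """Average batch processing rate in items per second."""
    return items_processed / (elapsed_hours * 3600)


if __name__ == "__main__":
    # Made-up response times in milliseconds, for illustration only.
    times_ms = [float(ms) for ms in range(1, 101)]
    print(response_time_summary(times_ms))
    # The batch example from the text: 5 million invoices in three hours.
    print(f"{batch_rate_per_second(5_000_000, 3.0):.0f} invoices per second")
```

Stating the batch result as a rate (about 463 invoices per second in the example above) makes it easier to project the window for a larger account base.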

External Vendor lab

Often, in order to hit the 10 million account target, the Benchmark requires the use of an external lab. The Benchmark may require a large server or a large number of servers, the database server may need to be large, and the data storage may need the newest vendor equipment. The marketing department may have signed a deal with the vendor to use their equipment at a reduced price in exchange for being part of a press release.

Archeology

Performance artifacts in development

Where are your requirements and development performance artifacts? Over the years of being a performance engineer, I have been involved in a number of projects related to performance and scalability readiness assessments. This involves evaluating the software, either from a vendor or developed in-house, to determine if it has been designed and developed with performance and scalability goals. During this readiness assessment project, the team and I look for non-functional requirements for the key business and system transactions, and for development guidelines and artifacts that track or measure service time during the development and unit testing phase. The aim is to find performance issues early.

Non-Functional requirements

To start, there are non-functional requirements that should have been defined for the development team. The team develops the code to make the business functions real. The next question is: where does your Software development lifecycle and methodology (that’s right, I said methodology) have activities and artifacts specific to performance, scalability, and stability? For example, the application needs a change to the pricing calculation or the order history functions; how fast should it be? Where is it specified that it still needs to be 300 milliseconds after the functional change? Initially the non-functional requirements have specified that the pricing calculation must be completed in 300 milliseconds for average complexity and 600 milliseconds for complex calculations. Can you point to the artifact(s) where that is defined in your methodology? Before the developer begins coding, is he or she aware of that?
Then we look for guidelines for developers and services provided by a framework. Has the Performance or Architecture team defined a set of guidelines for the developer to use when building this type of service? Has the use of caching been defined? Who verifies that the database access and SQL statements are optimal? Where is that captured, and what artifact captures it? Does each developer understand the proper use of logging and code instrumentation, or is it part of the development framework? For the case of the Pricing service, each method must measure service time (internal), and each exposed public service must have a service time measurement.
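As an illustration of what method-level service time measurement can look like, here is a minimal sketch of a timing decorator. The decorator, the in-memory store, and the `calculate_price` function are all hypothetical; a real framework would more likely ship the measurements to a logging or metrics system.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store: function name -> list of service times in ms.
service_times_ms: dict[str, list[float]] = defaultdict(list)


def measure_service_time(func):
    """Record the wall-clock service time of every call to func."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even when the call raises, so failures are timed too.
            elapsed_ms = (time.perf_counter() - start) * 1000
            service_times_ms[func.__name__].append(elapsed_ms)
    return wrapper


@measure_service_time
def calculate_price(quantity: int, unit_price: float) -> float:
    """Hypothetical stand-in for the pricing calculation."""
    return quantity * unit_price


if __name__ == "__main__":
    calculate_price(3, 2.0)
    print(dict(service_times_ms))
```

Putting the measurement in a decorator (or in the framework) means each developer does not have to remember to instrument every method by hand.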

Continuous Integration

A key artifact to look for is the results from the weekly or daily build process. Are there test results for the internal method calls and external service calls? JUnit will support the internal verification and JMeter can support the external verification. In order to get value from this, the testing database must be robust (not simply single rows with no history). But how can you use the response results during development to indicate eventual production performance? The value comes from comparing build to build: for instance, did the service time change radically? This can be an early indicator. However, the development environment or the database often changes. The Performance Engineer must show the business there is value in maintaining consistency in the development environment. With a consistent development environment you can show that the service time of the pricing service has significantly changed, well before production.

Key Performance artifact

For the JMeter test case: For build 1, the Pricing service is measured at 1.000 second. The goal is 300 milliseconds. Or, what if the service time is 100 milliseconds? Then you need to track the service time from build to build to monitor for consistency. If the 100 milliseconds goes to 1.000 second, how did that happen? Did the environment change, or did the developer add new code to the function? You must evaluate this, as you found it early.
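The build-to-build check can be sketched as a small function that a build pipeline could call with the measured service time. The 300 millisecond goal comes from the text; the 1.5x regression threshold and the function name are assumptions for illustration.

```python
from typing import Optional


def evaluate_build(previous_ms: Optional[float], current_ms: float,
                   goal_ms: float = 300.0, max_ratio: float = 1.5) -> list[str]:
    """Return findings for the current build's measured service time.

    max_ratio is an assumed tolerance: a build is flagged as a regression
    when its service time exceeds the previous build's by that factor.
    """
    findings = []
    if current_ms > goal_ms:
        findings.append("over goal")
    if previous_ms is not None and current_ms > previous_ms * max_ratio:
        findings.append("regressed")
    return findings


if __name__ == "__main__":
    print(evaluate_build(None, 1000.0))   # build 1 measured at 1.000 second
    print(evaluate_build(100.0, 1000.0))  # 100 ms in one build, 1.000 second in the next
    print(evaluate_build(100.0, 110.0))   # small variation between builds
```

The check only means something if the environment and the test database stay consistent between builds, which is the point made above.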

Where is the SPE team?

Where does a Software Performance Engineering team fit into the Enterprise?

Shared services and business units

Many Performance engineering teams end up somewhere, rather than being placed somewhere. There tends not to be much in the way of preplanning or discussion with the Sr. IT management team. In my experience teams have been in shared services organizations such as:

  1. Production/Operations,
  2. Quality Assurance,
  3. Development,
  4. Architecture,
  5. Testing Center,

where they have to support many different applications and provide different services across the Enterprise. Often, the business units do not know how to engage with a shared services team and the PE team does not know how to engage the business.

When a PE team is placed in a Business Unit, this typically means the business is aware of the value that the team brings. The team has a more Sr. leader who is technical, part project manager, and can interact with the business to ensure the team is aligned. The PE team will also be involved in more technical and product evaluations. Each organization influences the goals and purpose of the PE team. There must be an Enterprise Performance engineer path.

Care and feeding of the core PE team

The high value PE team usually has multi-disciplined people, people with a wider technical skillset than most other teams within IT. A low value performance testing team typically has a narrow set of skills, as their mission is to execute performance tests, and they do not understand the system under test very well. A critical part of managing a PE and a PT team is to make sure there is a well defined career path that helps shape the more junior people and helps to extend the technical and non-technical skillset of the more senior people. They should also grow their understanding of the business.

Enterprise architecture

The EA team is involved in many of the Enterprise's large-scale projects and they usually have a handle on the risks of the large projects. The EA team is involved early in the SDLC; they are a technical team that understands the business. When a PE team is placed here, it can have Enterprise-wide visibility into the key projects. This allows the team manager to insert the team into the high risk projects and to see the planned projects.

The Director of PE must be a direct report to the leader of the EA team. The goals of the typical EA team are very much in alignment with the goals of the PE team. The EA mission is to translate business requirements to a technical solution for business value. The PE mission is to help manage risk: the risk the applications will not meet the user experience, and the risk the applications will not achieve the business goals of scalability and reliability.

The risk here for the PE organization is that the team does not stay involved through the development and testing process. The PE career path here can be an issue, as the way to promotion is to be an EA. The EAs are involved early, and once the design is approved, they are not involved in the detailed development. The PE team must be involved in the full SDLC.

Quality assurance

Most QA organizations are focused on the functional requirements of the application. They are involved later in the SDLC; however, there is a trend to involve them in the requirements phase to define the test cases at that time. When a PE team is placed here, it tends to be very testing focused. The project development teams involve the PE team when getting ready to run performance tests, not during the design and development phases.

The QA team is very focused on the business needs; however, they tend not to be technical. There are very few software developers in the QA team. A high value performance engineering team is technical and business aware. This lack of technical focus within the group will impact the PE team and the PE career path. The QA career path is different from the PE path. QA can be a manual process in many organizations and lack the drive to automate testing. The PE team needs automation to be successful as technology changes.

The risk here is that there is not enough focus on the technical skillset, and too much focus on testing and not enough on design and development. Where does the career path lie?

Operations

The goal of Operations is to keep the production applications running smoothly and performing well, where all is well and the customers are happy. The Ops team is the end of the line for the application and they often feel the pain of poorly performing applications, where they have to make it perform. Often, the development teams do not make sure the application will perform well. Unfortunately, response time, scalability, and utilization of system resources then become the problem of Ops. They were often not involved at all during design and development. This is a technical team, often with a more limited understanding of the business needs.

The operations team puts in place monitoring and measurement processes and tools for the applications. The performance team placed in this group usually has a capacity planning and troubleshooting focus. This is because when the application is moved into production, historically it may not have performed well, and the Ops team had to fix it. They monitor the resource consumption of the production system, and they understand the workload of the applications and can determine when it changes.

The performance team in this group will be involved in predicting the expected capacity of the new or modified applications. They evaluate the workload, review performance test results and determine if additional computing resources are required. They will have access to the production workload and gain a good understanding of how the application is used. This information is critical to designing a high value performance test plan.

The risk for the PE team in this group is a reactive focus, and late involvement in the SDLC, where they are not aware early enough of the changes to the applications. They may not be involved in the performance testing of the application and may have limited access to the results of the testing. The career path is at risk here as well.

Business unit

Large business units set their priorities and can have large critical development programs that span years.  The IT and Business leadership team have bought into the value that PE brings and how it mitigates risks for the business. These large programs will also have many different Releases, each with added business functionality and complexity. The development team is also part of the business unit.  The business unit may set up the PE team as a shared resource within the BU, to be leveraged across the critical business applications.

When a PE team is placed within the business unit it is viewed as a critical success factor in each major release. The development process in this situation often has PE activities embedded in the SDLC: defining non-functional requirements, design reviews, code reviews and unit testing for performance, performance testing, and production monitoring and measuring. This is a more integrated team. The PE team is very in tune with the business goals and priorities. The PE team leads the PT team as well, so there are well defined performance test scenarios and the expected outcome is well known.

The risk in this case is minimal; however, the career path will be maintained outside the program. There will still be the need for an Enterprise Performance Engineering leader who can manage training and career paths.

Scattered Performance team

In this case, a performance engineering team does not exist. There are key people scattered across the organization who can take on the role of PE when needed. They can be in all the organizations mentioned previously. Often they are brought together (summoned) when there is a critical performance problem. This structure may be good enough for the business; production issues related to performance and scalability may be few and far between.

A performance testing center

The business has pushed for a low cost performance testing team that is typically in a different geography or off-shore. The development team or the QA team will define the test scenarios and test cases to be executed. In this case, you have people who are not performance engineers defining performance test cases for a testing team that does not have a performance engineer. The remote performance testing team often has very little understanding of the application under test. The PT team produces basic performance reports from the testing tool.

In some cases, the PT team cannot execute a test due to a technical issue that they cannot solve. They end up waiting until the next day for the development team to solve it. The value in a PT team comes from the ability to understand the application, overcome many technical issues and provide insightful test results. The testing center model requires more involvement from the development team, in some cases so much involvement that the development team ends up running the test.