For large complex systems, a performance data engineer is a must; because, historically, the database and the SQL statements have caused the most trouble when implemented by application architects with limited database understanding . Experience tells us, that when the application team refers to the database as persistent storage there is much opportunity for performance improvement.
This role requires a deep understanding of the physical implementation of the database system. In this role you must understand how the application will use the tables and the access paths for those tables. This insight is needed to define the indexing strategy. You must be able to create a physical implementation model; where you spread the database tables and indexes across the disk subsystem. This role must fully understand the normalization process and when you must de-normalize for performance. The goal is to minimize joins for very large tables.
In the role, you also provide guidance on writing SQL statements, writing database packages and procedures, provide guidelines on the use of the “hint” statement (don’t do it). Understanding the impact of views is very important. With database views there is a balance to strive for between too many views and too few views. In most large organization there is a mandatory one view per table, then additional views based on application requirements. Views can hide complexity, however, they can be tough to maintain for new team members if they are too complex.
This role requires the person to be very familiar with the particular optimizer for the database. You must be able to read SQL plans, know how to access the relevant performance tables and views, understand wait-events. What are the different types of wait events? What are the top SQL statements?