Understanding Query Processing
The query processor takes the queries you submit and yields highly efficient query plans that execute using minimal resources while ensuring that results are consistent and correct.
To process a query efficiently, the query processor uses:
· The specified query
· Statistics about the tables, indexes, and columns named in the query
· Configurable variables
To successfully process a query, the query processor must execute several steps across several modules:
· The parser converts the text of the SQL statement to an internal representation called a query tree.
· The preprocessor transforms the query tree for some types of SQL statements, such as those with subqueries and views, into a more efficient query tree.
· The optimizer analyzes the possible combinations of operations (join ordering, access and join methods, parallelism) to execute the SQL statement, and selects an efficient one based on the cost estimates of the alternatives.
· The code generator converts the query plan generated by the optimizer into a format more suitable for the query execution engine.
· The procedural engine executes command statements such as create table, execute procedure, and declare cursor directly. For data manipulation language (DML) statements, such as select, insert, delete, and update, the engine sets up the execution environment for all query plans and calls the query execution engine.
· The query execution engine executes the ordered steps specified in the query plan provided by the code generator.
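To see the plan that comes out of this pipeline for a particular statement, you can turn on showplan in your session before running the query; the table here is just an illustration from the pubs2 sample database:

set showplan on
go
select title, price from titles where price > 20
go
set showplan off
go

The showplan output lists, step by step, the access methods, join order, and worker processes the execution engine will use.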
How locking affects performance
Adaptive Server protects the tables, data pages, or data rows currently used by active transactions by locking them. Locking is a concurrency control mechanism: it ensures the consistency of data within and across transactions. Locking is needed in a multiuser environment, since several users may be working with the same data at the same time.
Locking affects performance when one process holds locks that prevent another process from accessing needed data. The process that is blocked by the lock sleeps until the lock is released. This is called lock contention.
A more serious impact on performance arises from deadlocks. A deadlock occurs when two user processes each hold a lock on a separate page or table and each wants to acquire a lock on the page or table held by the other process. The transaction with the least accumulated CPU time is killed and all of its work is rolled back.
Understanding the types of locks in Adaptive Server can help you reduce lock contention and avoid or minimize deadlocks.
To help reduce lock contention between update and select queries:
· Use datarows or datapages locking for tables with lock contention due to updates and selects (see the example after this list).
· If tables have more than 32 columns, make the first 32 columns the columns that are most frequently used as search arguments and in other query clauses.
· Select only needed columns. Avoid using select * when all columns are not needed by the application.
· Use any available predicates for select queries. When a table uses datapages locking, the information about updated columns is kept for the entire page, so that if a transaction updates some columns in one row, and other columns in another row on the same page, any select query that needs to access that page must avoid using any of the updated columns.
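As a minimal sketch of the first tip above, a table that suffers from update/select contention under allpages locking can be converted to datarows locking; the table name is hypothetical, and converting a large table can take time because the data is copied:

alter table orders lock datarows

Row-level locking reduces the chance that an update blocks a select that only needs a different row on the same page.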
Sybase Performance and Tuning (P&T) FAQ
1. Define two ways of measuring performance.
Ans. Response time and throughput.
2. What are the four (4) major phases of Adaptive Server Optimizer?
Ans. Parse, normalize, preprocess, and optimize.
3. What is optdiag used for?
Ans. ASE utility used to read, write, and simulate statistics.
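As a rough sketch, optdiag is run from the operating system command line; the database, table, login, and server names here are illustrative, and the exact flags can vary by version:

optdiag statistics pubs2..titles -Usa -Ppassword -SMYSERVER -o titles_stats.txt
optdiag statistics pubs2..titles -Usa -Ppassword -SMYSERVER -i titles_stats.txt

The first command writes the current statistics for the table to a file; the second loads (possibly edited) statistics back in, which is also how changed statistics can be simulated.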
4. What is the name of the GUI tool provided with the product that can capture abstract plans?
Ans. Planviewer
5. What are the three ways that table data can be accessed?
Ans. A table scan, a nonclustered index scan, or a clustered index scan.
6. How can you avoid last page contention on an APL heap table when many inserts and/or updates are made simultaneously?
Ans. Create a clustered index on the column that provides the most “randomness.”
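For example, on an APL table whose inserts all land on the last page, a clustered index on a non-sequential key spreads the inserts across the table; the names are illustrative:

create clustered index orders_cust_idx on orders(customer_id)

Keys that increase monotonically (identity columns, insert timestamps) do not help here, because all new rows still map to the end of the key range.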
7. Which is usually faster — a nonclustered index that covers the query or a clustered index?
Ans. A nonclustered index or a DOL clustered index that covers the query is usually faster than an APL clustered index.
8. In which system tables are optimizer statistics stored?
Ans. systabstats and sysstatistics
9. What command do you use to flush statistics from memory?
Ans. sp_flushstats <object_name>
10. What happens if housekeeper free write percent is set to zero?
Ans. The housekeeper does not run any of its tasks and so statistics are not flushed to systabstats.
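If the housekeeper has been disabled this way, it can be re-enabled by setting the parameter back to a nonzero value (1 is the usual default):

sp_configure "housekeeper free write percent", 1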
11. When can the Query Optimizer use an index to provide results for a query?
Ans. The query optimizer can use an index only when a query references the indexed column(s) in the where clause.
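As a small illustration, assume a hypothetical index on titles(price); only the first predicate lets the optimizer consider that index, because the second wraps the column in an expression:

select title from titles where price = 19.99
select title from titles where price * 2 = 39.98

Keeping the indexed column bare on one side of the comparison keeps the predicate usable as a search argument.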
12. What is the meaning of Join Transitive Closure (JTC)?
Ans. Join Transitive Closure (JTC) adds possible joins that are not explicitly stated in the where clause.
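A sketch of the idea, with illustrative table names:

select *
from A, B, C
where A.x = B.x
  and B.x = C.x

With JTC, the optimizer also considers the implied join A.x = C.x, which opens up extra join orders (for example, joining A and C first). Depending on the release, JTC can be enabled at the session level with set jtc on or server-wide through a sort-merge join and JTC configuration parameter.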
13. What is meant by the term "Index Covering"?
Ans. Index covering applies only to nonclustered indexes on APL tables and to either clustered or nonclustered indexes on DOL tables (DOL clustered indexes have a leaf level above the data level). It refers to the condition where the index pages contain all the data necessary to provide the results of a query; therefore, Adaptive Server does not need to access the data pages.
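For example, with a hypothetical nonclustered index on authors(au_lname, au_fname):

create nonclustered index au_names on authors(au_lname, au_fname)

select au_fname from authors where au_lname = 'Smith'

Both the search argument and the selected column live in the index leaf pages, so the query is covered and the data pages are never touched.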
14. Which of the three optimization goals is the default for ASE 15.0?
a. allrows_oltp
b. allrows_mix
c. allrows_dss
Ans. b. allrows_mix
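The goal can be changed per session or server-wide; the statements below are the ASE 15.0 form:

set plan optgoal allrows_oltp
sp_configure "optimization goal", 0, "allrows_oltp"

The first statement affects only the current session, while sp_configure changes the server-wide default.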
15. How can you display the details of missing statistics from join columns?
Ans. set option show_missing_stats on displays the details of missing statistics from join columns.
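A minimal usage pattern, with illustrative table names (depending on the version, you may also need to route diagnostic output to the client, for example with dbcc traceon(3604)):

set option show_missing_stats on
go
select o.id from orders o, line_items l where o.id = l.order_id
go
set option show_missing_stats off
go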
16. During query plan optimization, what settings does the optimizer look at to determine the total number of worker processes available for the query?
Ans. When the optimizer compiles a query plan, it looks at max parallel degree, max scan parallel degree, and the parallel clause of the select statement to determine the total number of worker processes available for a query.
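For instance, the server-wide limits are set with sp_configure, and an individual query can also name its own degree in the from clause; the values and table name are illustrative:

sp_configure "max parallel degree", 8
sp_configure "max scan parallel degree", 4

select count(*) from big_table (parallel 4)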
17. When is an adjusted query plan created and how does it differ from the optimal query plan?
Ans. An adjusted query plan is created when the number of worker processes specified in the optimal query plan is not available at runtime. The adjusted query plan differs from the optimal query plan in that it contains a smaller number of worker processes and it may use serial access instead of parallel access for some or all of the tables in the query.
18. What command would you use to run all queries in the current session in serial mode?
Ans. set parallel_degree 1
19. How can you see whether runtime adjustments are occurring?
Ans. Using the set process_limit_action command, showplan, or sp_sysmon.
20. What type of updates are available to the optimizer?
Ans. In-place, cheap, expensive and deferred
21. With respect to data placement, how do APL clustered indexes function differently from DOL clustered indexes?
Ans. APL clustered indexes maintain the table in index key order. As new rows are inserted, they must be inserted in their correct index position. DOL clustered indexes are placement indexes: if a row fits in its appropriate location, it is inserted in index key order; otherwise it is inserted into the next page with enough available space to hold the row. This can cause the DOL table to end up out of index key order.
22. Are APL data page deallocations handled differently from DOL data page deallocations? If so, how?
Ans. Yes. When the last row is deleted on an APL data page, the page is deallocated and returned to the list of available pages for the table. DOL deletes are logical; when the last row is deleted from a DOL page, the row is not physically removed, and the page is not deallocated until the housekeeper garbage collection (GC) task runs or reorg is executed against the table.
23. How can you avoid deferred updates?
Ans. You can avoid deferred updates by avoiding updates and deletes through joins, having a unique index somewhere on the table, and using a different column in the where clause than the column being changed in the set clause.
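As a sketch of an update that can stay direct rather than deferred (hypothetical table, with a unique index assumed on order_id):

update orders set status = 'shipped' where order_id = 1042

The where column differs from the set column, there is no join, and the unique index lets the row be located and changed in a single pass.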
24. What is the meaning of the value in the Num_Reuse column of the sp_monitorconfig report for the ‘procedure cache size’ parameter?
Ans. The number of query plans that have been removed from the cache to make space.
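For example:

sp_monitorconfig "procedure cache size"

A steadily growing Num_Reuse value suggests the procedure cache is too small and plans are being thrown out to make room.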
25. In ASE 15.0, what is the option used to control Run Time Adjustment behavior?
Ans. set process_limit_action quiet | warning | abort
26. What is the difference between contents of monSysSQLText and monProcessSQLText MDA tables?
Ans. monSysSQLText provides a historical view of the most recently executed SQL text; monProcessSQLText displays the SQL text that is currently being executed.
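A quick way to look at both, assuming the monitoring-related configuration options (such as enable monitoring and SQL batch capture) are already enabled:

select SPID, SQLText from master..monProcessSQLText
select SPID, SQLText from master..monSysSQLText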
27. Conceptually, how does Adaptive Server maintain statistics – as an attribute of an index or an attribute of a column?
Ans. Statistics are maintained as an attribute of a column, not an index. Even though statistics are generated on the leading column of an index when an index is created, they are generated only once for the column. Thus, only one set of statistics exists for a column no matter how many indexes the column is part of. Statistics can also be generated for minor index attributes and non-indexed columns.
28. What is the point in adding statistics on non-indexed columns - after all, they can’t make the optimizer use an index if none exists?
Ans. Adding statistics on non-indexed columns can help the optimizer cost joins more accurately. Without statistics the optimizer makes assumptions as to the number of qualifying rows.
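For example, a histogram can be created on a non-indexed column directly; the table and column names are illustrative:

update statistics titles (price)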
29. In what two situations could specifying additional histogram steps be useful?
Ans. Specifying additional histogram steps can be useful when the table is large and a search value falls between steps, leaving the optimizer unable to determine an accurate selectivity estimate.
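The step count can be raised when the statistics are written, for example (illustrative table and step count):

update statistics titles (price) using 100 values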
30. When does the optimizer rely on default selectivity values? Are they accurate? If not, can they be made more accurate?
Ans. In some cases where statistics are not available or usable, the optimizer may rely on default selectivity values which may or may not reflect reality. Default selectivity values can be modified to make them more accurately reflect the underlying data.
Will post a few more FAQs soon .... watch this blog ....