Sybase/SAP
ASE is robust and popular for its stability, versatility and performance for mixed data
loads (Both OLTP and DSS). As database scalability is a biggest concern over
years, continuous optimization of critical resources like CPU, MEMORY, effective
QUERY writing and Storage for IO is necessary to satisfy the Clients requirements and keep
the response-time and throughput inline with expectations.
I would like to write this article on IO/Storage as Disk waits
are a huge performance bottleneck among critical resources. This problem becomes more apparent when the
amount of data starts to grow so large that 99%+ logical IO is being impossible
and some physical IO is always needed. Increase the number of available disk
spindles (and thereby reduce the seek overhead) by either create devices in
different disks or striping the disks besides portioning the table.
Let us first see the OS Level
optimization options later see Sybase options.
If you do
not need to know when files were last accessed (which is not really useful on a
database server), you can mount your file systems with the -o noatime option. That
skips updates to the last access time in inodes on the file system, which
avoids some disk seeks.
On many
operating systems, you can set a file system to be updated asynchronously by
mounting it with the -o async option. If your computer is reasonably
stable, this should give you better performance without sacrificing too much
reliability. (This flag is on by default on Linux.)
Striping means that you have many disks and put
the first block on the first disk, the second block on the second disk, and the N-th
block on the (N MOD number_of_disks) disk, and so on. This means if
your normal data size is less than the stripe size (or perfectly aligned), you
get much better performance. Striping is very dependent on the operating system
and the stripe size, so benchmark your application with different stripe sizes.
The speed difference for striping is very dependent on the parameters. Depending on how you
set the striping parameters and number of disks, you may get differences
measured in orders of magnitude. You have to choose to optimize for random or
sequential access.
For reliability, you may want to use RAID 0+1
(striping plus mirroring), but in this case, you need 2 × N drives to
hold N drives of data. This is probably the best. However, you may
also have to invest in some volume-management software to handle it
efficiently.
A good option is to vary the RAID level
according to how critical a type of data is. For example, store semi-important
data that can be regenerated on a RAID 0 disk, but store really important data
such as host information and logs on a RAID 0+1 or RAID N disk. RAID N can
be a problem if you have many writes, due to the time required to update the
parity bits.
Specific to Sybase (ASE
way of accessing Storage):
To make the most of physical database tuning, understand these
distinctions between logical and physical devices:
- The
physical disk or physical device is the hardware that stores the data.
- A
database device or logical device is all or part of a physical disk that has
been initialized (with the disk init command) for use by Database Server. A database device can be an operating system file, an entire disk, or
a disk partition.
- A
segment is a named collection of database devices used by a database. The
database devices that make up a segment can be located on separate physical
devices.
- A
partition is a subset of a table. Partitions are database objects that can be
managed independently. You can split partitioned tables, so multiple tasks can
access it simultaneously. You can place a partition on a specific segment. If
each partition is on a different segment and each segment has its own database
device, queries accessing these tables benefit from improved parallelism.
Improving performance
by controlling object placement
Adaptive Server allows you to
control the placement of databases, tables, and indexes across physical storage
devices, which can improve performance by equalizing the reads and writes to
disk across many devices and controllers. For example, you can:
- Place
database data segments on a specific device or devices, storing the database
log on a separate physical device so that reads and writes to the log do not
interfere with data access.
- Spread
large, heavily used tables across several devices.
- Place
specific tables or nonclustered indexes on specific devices. For example, you
might place a table on a segment that spans several devices and its
nonclustered indexes on a separate segment.
- Place
the text and image page chain for a table on a separate
device from the table. The table stores a pointer to the actual data value in
the separate database structure, so each access to a text or image column
requires at least two I/Os.
- Distribute
tables evenly across partitions on separate physical disks to provide optimum
parallel query performance and improve insert and update performance.
For multiuser and multi-CPU systems that
perform a lot of disk I/O, be especially aware of physical and logical device
issues and the distribution of I/O across devices:
·
Plan
a balanced separation of objects across logical and physical devices.
·
Use
enough physical devices, including disk controllers, to ensure physical
bandwidth.
·
Use
an increased number of logical devices to ensure minimal contention for
internal I/O queues.
·
Determine
and use a number of partitions that allows parallel scans and meets query
performance goals.
If
you experience problems due to disk contention and other problems related to
object placement, check for and correct these issues:
·
Random-access
(I/O for data and indexes) and serial-access (log I/O) processes use the same
disks.
·
Database
processes and operating system processes use the same disks.
·
Serial
disk mirroring.
·
Database
logging or auditing takes place on the same disk as data storage.
ASE provides options to Improve I/O performance by:
- Using segments
- Partitioning tables for
performance
- RAID devices and partitioned tables
To improve I/O performance in Adaptive Server, you can also try:
- ·
Spreading
data across disks to avoid I/O contention
- ·
Isolating
server-wide I/O from database I/O
- ·
Separating
data storage and log storage for frequently updated databases
- ·
Keeping
random disk I/O away from sequential disk I/O
- ·
Mirroring
devices on separate physical disks
- ·
Using
partitions to distribute table data across devices
A segment is a label that points to one or more
logical devices. Use segments to improve throughput by:
- Splitting
large tables across disks, including tables that are partitioned for parallel
query performance
- Separating
tables and their nonclustered indexes across disks
- Separating
table partitions and index across the disks
- Placing
the text and image page chain on a disk other than the one on which the table
resides, where the pointers to the text values are stored.
In addition, you can use segments to control
space usage:
- Tables
or partitions cannot grow larger than their segment allocation. You can use
segments to limit the table or partition size.
- Tables
or partitions on other segments cannot use the space allocated to objects on
another segment.
- The
threshold manager monitors space usage.
Partitioning a table can improve performance
for several types of processes.
- Partitioning
allows parallel query processing to access each partition of the table. Each
worker process in a partitioned-based scan reads a separate partition.
- Partitioning
allows you to load a table in parallel with bulk copy.
- For
more information on parallel bcp, see the Utility Programs manual.
- Partitioning
allows tou to distribute a table’s I/O over multiple database devices.
- Semantic
partitioning (range-, hash- and list-partitioned tables) improves response time
because the query processor eliminates some partitions.
- Partitioning
provides multiple insertion points for a heap table.
The tables you choose to partition and the type
of partition depend on the performance issues you encounter and the performance
goals for the queries on the tables.
RAID
A striped redundant array of independent disks
(RAID) device can contain multiple physical disks, but Adaptive Server treats
such a device as a single logical device. You can use multiple partitions on
the single logical device and achieve good parallel query performance.
To determine the optimum number of partitions
for your application mix, start with one partition for each device in the
stripe set. Use your operating system utilities (vmstat, sar, and iostat on
UNIX; Performance Monitor on Windows) to check utilization and latency.
To check maximum device throughput, use select
count(*), using the index table_name clause to force a table
scan if a nonclustered index exists. This command requires minimal CPU effort
and creates very little contention for other resources.
The Adaptive Server optimizer attempts to find the most efficient access
path to your data for each table in a query by estimating the cost of the
physical I/O needed to access the data, and the number of times each page must
be read while in the data cache.
Performance is critically depends on access methods in ASE.
In most database applications, there are many
tables in the database, and each table has one or more indexes. Depending on
whether you have created indexes, and what kind of indexes you have created,
the optimizer’s access method options include:·
- Table scan –
reading all the table’s data pages, sometimes hundreds or thousands of pages.
- Index access –
using the index to find only the data pages needed, sometimes as few as three
or four page reads in all.
- Index covering –
using only an index to return data, without reading the actual data rows,
requiring only a fraction of the page reads required for a table scan.
Make sure your IO is controlled with the above options so as to use optimized way of the storage and better throughput and response-time.
- Thanks for many blogs/sites, especially SAP/Sybase sites, helps me write this besides my experience :)
.