
Experimental Design

A central difficulty in designing this evaluation study was the number of variables that could be manipulated: block size, the number of groups with GSS, the mix of media types, the mix of requests, the number of participating disks, the number of disks that constitute a cluster per media type, the bandwidth of each disk as a function of the number of participating disks, closed versus open evaluation, the role of the tertiary storage device, the size of the database, the frequency of access to the objects that constitute the database, etc. We spent weeks analyzing alternative ways of conducting this study. It was clear that we had to reduce the number of manipulated parameters to obtain meaningful results. As a starting point, we decided to: (1) ignore the role of the tertiary storage device and focus on the performance of Mitra during a steady state where all referenced objects are disk resident, and (2) focus on a single media type. Moreover, we partitioned this study into two parts. The first focuses on the performance of a single disk and the implementation techniques that enhance its performance; the second focuses on the scalability characteristics of Mitra as a function of additional disks.
 
(a) Number of votes for each clip

(b) Length [in seconds] of each clip

Figure 7: Characteristics of the CD audio clips

The target database and its workload were based on a WWW page that ranks the top fifty songs every week(7). We chose the top 22 songs of January 1995 to construct both the benchmark database and its workload. (We could not use all fifty because the top 22 audio clips alone exhausted the storage capacity of a one-disk Mitra configuration.) Figures 7a and 7b show the frequency of access to the clips and the length of each clip in seconds, respectively. The size of the database was fixed for all experiments.

We employed a closed evaluation model with zero think time. With this model, a workload generator process is aware of the number of simultaneous displays supported by a configuration of Mitra (say N). It dispatches N requests for object displays to Mitra. (Two or more requests may reference the same object, see below.) As soon as Mitra completes the display of a request, the workload generator issues another request to the Scheduler (zero think time). The distribution of request references to clips is based on Figure 7a, as follows. We normalized the number of votes for each of the 22 clips by the total number of votes for these objects. The workload generator employs this distribution to construct a queue of requests that reference the 22 clips. This queue is randomized to produce a non-deterministic reference pattern. Two or more requests may nevertheless reference the same clip (e.g., the most popular clip) at the same time. Unless noted otherwise, Mitra was not configured to multiplex a single stream to service these requests.
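The workload generator described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Mitra's actual generator: the function names `build_request_queue` and `run_closed_model`, the vote counts, and the clip lengths are all hypothetical (the real values appear only in Figure 7), and each display is assumed to last exactly as long as its clip. Because per-clip counts are rounded, the queue length may differ slightly from the requested length.

```python
import heapq
import random
from collections import deque

def build_request_queue(votes, queue_length, seed=None):
    """Return a shuffled list of clip ids whose frequencies follow the vote shares."""
    total_votes = sum(votes.values())
    queue = []
    for clip, count in votes.items():
        share = count / total_votes  # normalized vote share of this clip
        queue.extend([clip] * round(share * queue_length))
    random.Random(seed).shuffle(queue)  # randomize the reference pattern
    return queue

def run_closed_model(queue, n_displays, clip_seconds):
    """Replay the queue with n_displays requests kept outstanding (zero think time).

    Returns a list of (issue_time, clip) pairs in issue order.
    """
    if n_displays < 1:
        raise ValueError("n_displays must be at least 1")
    pending = deque(queue)
    active = []  # min-heap of (finish_time, sequence_number)
    issued = []
    now = 0.0
    sequence = 0
    while pending or active:
        # Fill every free display slot immediately (zero think time).
        while pending and len(active) < n_displays:
            clip = pending.popleft()
            heapq.heappush(active, (now + clip_seconds[clip], sequence))
            issued.append((now, clip))
            sequence += 1
        # Advance the clock to the next display that completes.
        now, _ = heapq.heappop(active)
    return issued

# Hypothetical vote counts and clip lengths, for illustration only.
votes = {"clip_a": 50, "clip_b": 30, "clip_c": 20}
clip_seconds = {"clip_a": 240.0, "clip_b": 200.0, "clip_c": 180.0}
queue = build_request_queue(votes, queue_length=10, seed=7)
for issue_time, clip in run_closed_model(queue, n_displays=2, clip_seconds=clip_seconds):
    print(f"t={issue_time:7.1f}s  issue {clip}")
```

Nothing in this sketch prevents the same clip from being displayed by several outstanding requests at once, which matches the case noted above where no stream multiplexing is configured.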

Seagate ST31200W
  Capacity                     1.0 gigabyte
  Revolutions per minute       5400
  Maximum seek time            21.2 milliseconds
  Maximum rotational latency   11.1 milliseconds
  Number of zones              23 (see Figure 4)

Database Characteristics: CD Quality Audio
  Sampling rate                44,100 samples per second
  Resolution                   16 bits
  Channels                     2 (stereo)
  Bandwidth requirement        1.3458 Mbps

Table 3: Fixed Parameters
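Two entries in Table 3 can be derived from the others, which serves as a quick sanity check. The sketch below assumes that the table's "Mbps" denotes 2^20 bits per second; the source does not state this, but it is the convention under which the arithmetic reproduces 1.3458. The constant names are illustrative.

```python
SAMPLING_RATE_HZ = 44_100   # samples per second, per channel
BITS_PER_SAMPLE = 16
CHANNELS = 2
RPM = 5400

# Bandwidth requirement of one CD-quality audio stream.
bits_per_second = SAMPLING_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
mbps_binary = bits_per_second / 2**20    # assumption: 1 Mb = 2^20 bits
mbps_decimal = bits_per_second / 10**6   # for comparison: 1 Mb = 10^6 bits

# Maximum rotational latency is the time of one full revolution.
max_rotational_latency_ms = 60_000 / RPM

print(f"{bits_per_second} bits/s")
print(f"{mbps_binary:.4f} Mbps (binary), {mbps_decimal:.4f} Mbps (decimal)")
print(f"{max_rotational_latency_ms:.1f} ms per revolution")
```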

This experimental design consists of three phases: warmup, steady state, and shutdown. During system warmup, Mitra moves toward full utilization; during shutdown, it drains toward idle. In our experiments, we focused on the performance of Mitra during steady state and collected no statistics during either warmup or shutdown.
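One simple way to restrict statistics to the steady state is to discard every sample taken outside a fixed time window. The sketch below is only an illustration: the source does not say how the phase boundaries were determined, so `warmup_end` and `shutdown_start` are hypothetical parameters supplied by the caller, and the sample values are invented.

```python
from statistics import mean

def steady_state_samples(samples, warmup_end, shutdown_start):
    """Keep only (timestamp, value) samples with warmup_end <= timestamp < shutdown_start."""
    return [(t, v) for t, v in samples if warmup_end <= t < shutdown_start]

# Hypothetical (time in seconds, number of active displays) samples.
samples = [(0, 3), (10, 9), (20, 12), (30, 12), (40, 12), (50, 6)]
kept = steady_state_samples(samples, warmup_end=20, shutdown_start=50)
print(mean(v for _, v in kept))
```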
