21.07.2006
DHCAT Scoring Methodology
The constitution also guided the development team as it created the DHCAT scoring system, one of the new tool’s most critical and sophisticated elements. Three very different assessment and scoring processes are applied to the different types of Digital Home video primitives.
Scoring Video Quality-based Primitives
The most difficult challenge facing the development team was creating an assessment process that would replicate, capture and accurately quantify a user’s subjective judgment of video image quality. In essence, the problem was how to embed a virtual human jury in the tool itself.
As it turns out, the DHCAT team’s problem was far from unique. Research in the broadcast and telecommunications industries has produced a significant body of knowledge about the interaction between delivered video/audio content and human physiology, and what influences perceived quality of a media experience. Those insights underlie the technical discipline of perceptual modeling, the use of mathematical analysis and modeling techniques to accurately predict user judgments. The practice was used extensively in the development of new industry standards for high definition television. For more about how DHCAT uses perceptual modeling, please see this article on the Intel® Capabilities Forum.
Working with specialists at Intel’s User Centered Design Group, and Psytechnics*, a leader in the development of perceptual modeling techniques, the development team created perception models specifically for video quality in three separate Digital Home applications: video playback, recording and streaming. In each application, the model maps an objectively measurable performance attribute measured by DHCAT to a Mean Opinion Score (MOS) that reflects the quality score that human evaluators would have provided. Numerical MOS scores are broadly rated as either acceptable or unacceptable, with finer-grained quality grades ranging from poor to excellent.
- Video Playback Quality is measured by playing a video file on the test system while holding frame quality constant. DHCAT measures frame rate variation and calculates the root-mean-square error (RMSE). The lower the RMSE, the higher the MOS score awarded by the perceptual model.
- Video Recording Quality is measured by recording a reference video file on the test system, then using Psytechnics*’ video analysis tool to compare the recorded and reference files and quantify the degradation. The amount of measured degradation in the recorded file is then mapped to a MOS using the perceptual model.
- Video Streaming Quality is measured by streaming a video file from the test system to DHCAT’s virtual digital media adapter (DMA). One of the principal goals for DHCAT was that it be self-contained, so the DHCAT team created a virtual DMA that can receive a video stream from the tested platform’s video server. Every video file has an expected run length. However, if a system is overworked, the video server will not be able to deliver the video stream smoothly to a DMA. The likely result is freeze-frames while the DMA waits for new video frames from the server. In a user study, Intel’s User Centered Design Group determined that on a 24-second video clip, people will tolerate about three seconds of freeze-frames (increasing actual playback time to 27 seconds) before MOS scores drop off considerably. DHCAT’s streaming video methodology therefore compares the actual playback time with the expected playback time. The bigger the difference between the two, the lower the MOS score.
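The two objective measurements behind the playback and streaming scores can be illustrated with a minimal Python sketch. It is not DHCAT's implementation: the function names, the sample frame rates, and the choice of a fixed target frame rate as the RMSE reference are assumptions made for illustration, and the perceptual model that maps these values to a MOS is not reproduced because the article does not describe it.

```python
import math


def frame_rate_rmse(measured_fps, target_fps):
    """Root-mean-square error of measured frame rates against a target rate.

    Assumption: the error is taken relative to a fixed target frame rate.
    """
    if not measured_fps:
        raise ValueError("need at least one frame-rate sample")
    squared_errors = [(fps - target_fps) ** 2 for fps in measured_fps]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def playback_overrun(actual_seconds, expected_seconds):
    """Extra playback time caused by freeze-frames, never negative."""
    return max(0.0, actual_seconds - expected_seconds)


# Hypothetical frame-rate samples for a clip with a 30 fps target.
steady = frame_rate_rmse([30.0, 30.0, 30.0, 30.0], target_fps=30.0)
jittery = frame_rate_rmse([30.0, 24.0, 30.0, 18.0], target_fps=30.0)

# The 24-second clip from the user study, played back in 27 seconds.
overrun = playback_overrun(actual_seconds=27.0, expected_seconds=24.0)

print(f"steady RMSE:  {steady:.2f}")
print(f"jittery RMSE: {jittery:.2f}")
print(f"overrun:      {overrun:.1f} s")
```

In this sketch a lower RMSE and a smaller overrun both correspond to a higher MOS, matching the direction described above.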
Scoring Response Time-based Primitives
Only one of the primitives included in DHCAT v1.5 is scored on the basis of response time—Prepare video for a Portable Media Player (PMP). It is tested by encoding a 57-second video clip for transfer to a PMP. The encoding time is recorded and compared to the original file runtime, providing a ratio that is referred to as the speedup factor.
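As a rough illustration, the speedup factor can be computed as below. The direction of the ratio (clip runtime divided by encoding time, so that faster encoding yields a larger factor) is an assumption, since the article only says the two times are compared. The function name and the 28.5-second encoding time are hypothetical.

```python
def speedup_factor(clip_runtime_seconds, encoding_seconds):
    """Ratio of clip runtime to encoding time.

    Assumption: values above 1.0 mean the clip was encoded
    faster than real time.
    """
    if encoding_seconds <= 0:
        raise ValueError("encoding time must be positive")
    return clip_runtime_seconds / encoding_seconds


# The 57-second test clip, with a hypothetical encoding time of 28.5 seconds.
factor = speedup_factor(clip_runtime_seconds=57.0, encoding_seconds=28.5)
print(f"speedup factor: {factor:.1f}x")
```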
Scoring Capability Check Primitives
Six of the primitives included in DHCAT v1.5 are scored on a pass/fail basis, depending on whether or not that functionality is supported in the test system. The six current DHCAT capability checks are:
- Listen to stereo audio
- Watch your favorite TV show
- Watch your favorite HDTV show
- Record two of your favorite Standard Definition TV (SDTV) shows
- Listen to High Definition Audio
- Presence of a DLNA* v1.0-compliant media server
Scoring Multiple Instantiations
DHCAT will run video-based primitives up to four times to accommodate the four supported video formats (WMV*, DivX*, MPEG-2 and QuickTime*), if they are present on the test platform. Each instance produces a separate run score, all of which are recorded and included in aggregate scores. As a result, test systems that support more video formats, and can therefore play a wider array of content, earn higher aggregate scores.
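The effect of multiple instantiations on an aggregate score can be sketched as follows. The article does not say how run scores are combined, so this sketch assumes a simple sum, which is one way for additional formats to raise the total. The function name and the run scores are hypothetical.

```python
SUPPORTED_FORMATS = ("WMV", "DivX", "MPEG-2", "QuickTime")


def aggregate_score(run_scores):
    """Combine per-format run scores into one aggregate value.

    Assumption: the aggregate is a plain sum of the run scores for the
    formats present on the test platform.
    """
    unknown = set(run_scores) - set(SUPPORTED_FORMATS)
    if unknown:
        raise ValueError(f"unsupported formats: {sorted(unknown)}")
    return sum(run_scores.values())


# Hypothetical run scores for one video primitive on two platforms.
two_formats = aggregate_score({"WMV": 4.0, "MPEG-2": 3.5})
four_formats = aggregate_score(
    {"WMV": 4.0, "DivX": 3.0, "MPEG-2": 3.5, "QuickTime": 3.0}
)

print(f"two formats:  {two_formats}")
print(f"four formats: {four_formats}")
```

Under this assumption, a platform that supports all four formats accumulates more run scores than one that supports only two, even when the scores for the shared formats are identical.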