
Copyright 2002 Society of Photo-Optical Instrumentation Engineers. This paper will be published in Proc. Astronomical Telescopes and Instrumentation 2002 and is made available as an electronic preprint with permission of SPIE. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or commercial purposes, or modification of the content of this paper are prohibited.


Dr. Immanuel Freedman

ABSTRACT
Many scientists would like to be able to view and analyze quick look astronomical data on hand held devices linked by wireless network to the Internet. Scientific data is often characterized by high dynamic range together with abrupt, localized or extended changes of spatial and temporal statistical properties. I compare the effectiveness of algorithms for the efficient approximation of scientific data that support low bit-rate, near real-time and low-delay communication of heterogeneous multidimensional scientific data over existing or planned wireless networks.

Keywords: data analysis – image processing – astronomical data bases: miscellaneous

1. INTRODUCTION

Many scientists–including astronomers–are aware that hand-held devices offer inexpensive mobile displays together with near-laptop computing and wireless communications capabilities. The recent advent of large virtual observatories (European AVO and NASA’s SkyView…), orbiting telescopes (HST, NGST, Chandra…), manned space missions (Mission to Planet Mars…) and large ground-based CCD spectrographs (SUBARU…) all indicate an urgent need for the development of improved quick look imagery that can be displayed for rapid decision-making on mobile hand-held devices linked by wireless network to the Internet.

In this paper, I propose solutions that highlight the role of approximation in the visual, symbolic and graphic presentation of astronomical and engineering data for intuitive understanding and exploratory data analysis. Accurate astrometry and photometry may, in principle, be made available on demand for user-selected celestial objects, regions of interest, full sky background, ancillary measurements or engineering data sets.

The horizon time for taking action based on decisions made with the help of quick look data may vary widely with application. For example, the ALADIN project [1] considers that 20 seconds is the maximum delay an end-user can be expected to wait for a 2 MB archived image served over the Web. The SUBARU telescope Quicklook Producer [2] is run daily at the Hilo Base Facility and typically generates about 500 kB of compressed imagery from a 16x10 MB data set based on a 10-15 minute observation made by the Suprime-Cam and other instruments. Such quick look data could take 60-200 seconds to transfer over a 14.4-28.8 kbps phone-grade network connected to the Internet. If a network of observers were provided with quick look data served to inexpensive mobile hand-held platforms, they could place telephone calls or initiate remote system commands within minutes of discovering serious instrument or telescope malfunctions (e.g., the collapse of the Green Bank telescope or the failure of a gyroscope on board the COBE mission) or claim priority within hours for the serendipitous discovery of astronomical objects.
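The delay figures above follow from simple link arithmetic; a minimal sketch of the calculation (ignoring protocol overhead and packetization, which lengthen real transfers):

```python
def transfer_seconds(size_bytes, rate_kbps):
    """Idealized time to move size_bytes over a rate_kbps link,
    ignoring protocol overhead and packetization."""
    return size_bytes * 8 / (rate_kbps * 1000)

# ~500 kB of compressed quick look imagery over a phone-grade link
for rate_kbps in (14.4, 28.8):
    print(f"{rate_kbps} kbps: {transfer_seconds(500e3, rate_kbps):.0f} s")
```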

In this paper, I adopt cost and performance goals for a system based on mobile hand held devices of about 25-50% of the capital cost and about 100% of the operational cost of a system based on laptop computers, together with performance and quality goals of retrieving and displaying pre-processed data of visual quality satisfactory to the user within about 30 seconds of a user demand.

The determination of visual quality is application-dependent. The hand-drawn sketches with which Galileo announced the discoveries of the Jovian satellites [3], the Saturnian rings and the existence of sunspots are strikingly simple. In the analogous field of cartographic generalization [4], map makers extract and simplify relevant information and present it at smaller scales in a less cluttered way [5], while meeting defined specifications on topology or geometry. In this paper I focus primarily on subjective quality assessments, with only brief discussion of computation time and encoding delay.

Astronomical data is often heterogeneous in that variables of differing statistical properties are often stored in the same file or data stream but are not necessarily optimally organized for data compression. Such data is often also multidimensional and samples many associated variables. Examples include hyperspectral imagery (e.g., COBE/DIRBE data, or SUBARU/HDS echelle spectrography), imagery that evolves with time (e.g., SOHO observations of solar activity) and astronomical catalog entries indicating observational variables including object name, epoch, coordinate system, position, spectral class, spectral filter, time, exposure and observer’s comments together with derived physical quantities including distance, redshift, mass, temperature and velocity dispersion.

The plan of the paper is as follows. In Section 2, I present a system overview and compare hand-held devices with laptop computers in terms of specification and cost. I also compare existing or planned wireless networks with broadband networks in terms of available bandwidth and data transmission cost. In Section 3, I develop novel methods and combinations of methods of data approximation. In Section 4, I compare algorithms in terms of visual quality, bit rate and encoding delay. In Section 5, I present a summary of the findings and develop conclusions.

2. SYSTEM OVERVIEW AND COST COMPARISON

Platform  | Processing Resource | Operating System | Display                          | Expansion Options | Manufacturer's Suggested Retail Price
----------|---------------------|------------------|----------------------------------|-------------------|--------------------------------------
PocketPC  | Intel StrongARM     | Win CE (Linux?)  | 240x320x16                       | SD Card slot      | about $730
FOMA      | Unknown             | Unknown          | 176x144x12                       | Unknown           | about $500
Palm      | Motorola 68K        | PalmOS (Linux?)  | 160x160x8 (effectively 160x120x8)| SD Card slot      | about $199
Laptop    | x86                 | Many             | 1024x768x24                      | Many              | about $1k to $5k

Table 1. Comparing the cost and specification of platform resources.

 

Network | Max (Typical) Speed                         | Suggested Fixed Cost | Suggested Variable Cost
--------|---------------------------------------------|----------------------|------------------------------------
UMTS    | 2 Mbps                                      | Unknown              | Unknown
FOMA    | 384 kbps (64 kbps now, 325 kbps next phase) | About $30/month      | About $4.2E-4 per 128-byte packet
GPRS    | 128 kbps (40 kbps)                          | Unknown              | Unknown
iMode   | 9.6 kbps (5 kbps)                           | About $10-20/month   | About $2.5E-3 per 128-byte packet
xDSL    | 1.5 Mbps (300 kbps)                         | About $30/month      | None

Table 2. Comparing the speed and cost of network resources.
Figure 1 indicates the primary components of a system based on hand-held devices for the evaluation of quick look astronomical data. Although there are still unknowns in the cost and capability information, Table 1 and Table 2 clearly show that systems based on hand-held devices (especially those based on FOMA) can be cost-competitive with laptop systems for astronomical quick look use, since at best they can deliver almost 10 MB of image data in about 30 seconds. A modest compression ratio of 16:1 would suffice to deliver SUBARU observations comprising 16x10 MB in less than 30 seconds.

In passing, note that a hand-held system based on the Linux operating system may become capable of running popular astronomical image processing packages such as ESO-MIDAS, AIPS, IRAF, IDL or MATLAB.

3. DATA APPROXIMATION ALGORITHMS

1. Acquisition | 2.* Preprocess | 3. Encode   | 4.* Postprocess | 5. Transfer | 6. Decode   | 7. Display
---------------|----------------|-------------|-----------------|-------------|-------------|-----------
From source    | None           | See Table 4 | None            | Network     | Per Table 4 | Dither
From source    | Unsharp mask   | See Table 4 | Deconvolution   | Network     | Per Table 4 | Dither

Table 3. Description of the overall sequence of steps (including optional steps denoted by *).

 

Method                       | 1. Transform               | 2. Quantize Coefficients     | 4. Organize Coefficients  | 5. Reduce Geometrical Redundancy | 6. Reduce Statistical Redundancy | 7.* Model Residuals
-----------------------------|----------------------------|------------------------------|---------------------------|----------------------------------|----------------------------------|----------------------------------------------------------
DCTune                       | DCT 8x8                    | Perceptual                   | Zigzag sequence           | Run-length coding                | Huffman                          | Rice coding, summary statistics, Discrete Sine Transform
JPEG-2000                    | Wavelet                    | Perceptual                   | Pyramidal                 | Yes                              | Yes                              | Rice coding, summary statistics
FRIT                         | Finite Radon + Symmlet (4) | Retain top 0.5%              | Zigzag sequence           | Index of non-zero coefficients   | gzip                             | Rice coding, summary statistics
Spectral decorrelation       | PCA                        | Retain top 20% of components | Linear, spectral          | Follow with spatial coding       | Huffman or gzip                  | Rice coding, summary statistics
Spectral block matching      | MPEG-4                     | Perceptual                   | Zigzag sequence           | Run-length coding                | Huffman                          | Rice coding, summary statistics
Entropy                      | gzip, Arithmetic           | No                           | Organized by data variable| None                             | LZW, Arithmetic                  | None
DPCM                         | DPCM                       | No                           | Non-causal predictor      | None                             | None                             | Rice coding
Connected-component labeling | Morphological              | Yes                          | Linear                    | Yes                              | Yes                              | Summary statistics
Cartographic generalization  | Gradient Vector Flow snake | Line simplification          | Linear, possible tree     | Yes                              | gzip                             | Summary statistics

Table 4. Description and comparison of the sequence of steps used in the Encode step of Table 3.

 

Table 3 and Table 4 indicate the sequence of steps and combinations of methods of data approximation I used to explore the relationship between visual quality, bit rate and encoder delay.

 

DCTune [6] provides perceptual quantization matrices for the Independent JPEG Group encoder. I set the viewing distance at 16 inches for hand-held devices rather than the usual 23.5 inches for desktop computers. JPEG-2000 [7] is a scalable wavelet-based encoder standardized by the JPEG committee. FRIT is an extension of a Finite Ridgelet Transform [8] encoder in which the retained coefficients are encoded initially by position and subsequently entropy coded. Decorrelation [9] of the spectral channels provides an effective color space for subsequent spatial coding. MPEG-4 provides motion compensation by block matching for sequences of movie frames with strong spatial correlation. Although the spectral channels are certainly not temporal in nature, if strong spectral correlations exist, the same block-matching concept may, in principle, be used to reduce spectral redundancy. DPCM extends ENCODE [10] to multiple dimensions through a non-causal nearest-neighbor predictor. Cartographic generalization refers to a method of segmenting images by active contours known as Gradient Vector Flow snakes [11], for which methods of line simplification [12] and generalization are known.
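Several of the Table 4 pipelines share the zigzag-scan and run-length stages. As a simplified illustration of those two steps (a sketch, not the actual codec internals):

```python
def zigzag_indices(n=8):
    """Anti-diagonal traversal of an n x n block, alternating direction,
    so low-frequency coefficients come first (as in baseline JPEG)."""
    def key(ij):
        i, j = ij
        s = i + j
        return (s, -j if s % 2 else j)
    return sorted(((i, j) for i in range(n) for j in range(n)), key=key)

def run_length(seq):
    """Emit (zero_run, value) pairs for nonzero entries; trailing zeros
    collapse to an end-of-block marker."""
    out, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            out.append((run, v))
            run = 0
    if run:
        out.append(("EOB",))
    return out

# A quantized 8x8 block with three surviving low-frequency coefficients
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0] = 12, -3, 2
codes = run_length([block[i][j] for i, j in zigzag_indices()])
print(codes)  # [(0, 12), (0, -3), (0, 2), ('EOB',)]
```

The run-length pairs would then feed the Huffman (or gzip) stage of Table 4.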

 

The DPCM has the lowest encoder delay, as only the first or second nearest-neighbor pixels, which constitute the Minimum Coded Unit (MCU), need buffering. The Spectral Decorrelation method reduces the delay compared to methods that process all spectral channels. DCTune and JPEG-2000 conventionally buffer a full frame of imagery; however, the MCU is 8x8 pixels for each spectral channel to be processed. Likewise, the MPEG-4 MCU may, in principle, be as small as a Group of Blocks (for H.263 compatibility) but is usually the size of a Group of Pictures (at least 2 pictures for P frames, 3 pictures for B frames). The minimum coded unit for the FRIT method is one block of pixels constituting a finite projective space. For this application I considered blocks whose size in each dimension is a power of a prime number. The encoding delay for entropy coding depends on the application and the size of the buffer. Note, in passing, that Arithmetic Entropy Coding has, in principle, no delay. Connected Component Labeling refers to a morphological technique for creating a graph representation of extended objects, which can in turn be presented visually as a sketch. The MCU is the size of the connected set used in the search process. The simplification and generalization techniques used in the Cartographic Generalization approach may have delays ranging from a line segment (to be processed by the Douglas-Peucker algorithm) to a full frame.
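The short-buffer property of DPCM is easiest to see in a causal nearest-neighbor variant, sketched below as an illustrative stand-in (the paper's method uses a non-causal predictor):

```python
import numpy as np

def dpcm_encode(img):
    """Residuals against a causal nearest-neighbor prediction: each pixel
    is predicted from its west neighbor (first column from the north
    neighbor), so only a one-pixel MCU needs buffering."""
    pred = np.zeros_like(img)
    pred[:, 1:] = img[:, :-1]   # west neighbor
    pred[1:, 0] = img[:-1, 0]   # north neighbor for the first column
    return img - pred

def dpcm_decode(res):
    """Invert the predictor by scanning in raster order."""
    img = np.zeros_like(res)
    for r in range(res.shape[0]):
        for c in range(res.shape[1]):
            p = img[r, c - 1] if c else (img[r - 1, c] if r else 0)
            img[r, c] = res[r, c] + p
    return img
```

A real coder would follow the residuals with Rice coding, as in Table 4.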

 

JPEG-2000 and DCTune (JPEG) blur images, and JPEG displays blocking artifacts for compression factors exceeding about 40:1. I explored the Point Spread Function of the JPEG-2000 and JPEG transforms as a function of spatial position within a 2D image and was able to restore the compressed image, yielding about 2 dB improvement relative to the original image, by deconvolution with the worst-case Point Spread Function. I applied the Lucy-Richardson method implemented in the ESO-MIDAS data analysis package. Unsharp masking the original image yielded a modest improvement in the visual quality of the compressed image.
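The restoration used the Lucy-Richardson implementation in ESO-MIDAS; the same multiplicative update can be sketched in NumPy, assuming a spatially invariant, centered PSF and circular boundary conditions (the measured JPEG-2000 PSF actually varies with position, as noted above):

```python
import numpy as np

def richardson_lucy(blurred, psf, iters=25):
    """Lucy-Richardson deconvolution with a circular (FFT) convolution
    model; psf must be the same shape as the image and centered."""
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    x = np.full_like(blurred, blurred.mean())
    for _ in range(iters):
        conv = np.real(np.fft.ifft2(np.fft.fft2(x) * otf))
        ratio = blurred / np.maximum(conv, 1e-12)
        # Multiply by the correlation with the PSF (adjoint of the blur)
        x *= np.real(np.fft.ifft2(np.fft.fft2(ratio) * np.conj(otf)))
    return x
```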

 

JPEG blocking artifacts may be effectively suppressed by adding a small contribution from the Discrete Sine Transform [13]. Furthermore, the residual carries information about instrument noise, sky background and unresolved sources. Although Rice coding has been used effectively to encode residuals with distributions similar to the Laplacian, an alternative approach is to report summary statistics that characterize the stochastic process of the residuals. Such statistics include the anisotropy or energy of the residuals (components of the structure tensor), minimum and maximum residual sky brightness, or the parameters of an Autoregressive Moving Average model. Although it is often stated that noise is incompressible, identification of the noise process parameters conveys enough information to reconstruct a statistically equivalent noise field.
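Rice coding splits each mapped residual into a unary quotient and k literal bits, which is why it suits Laplacian-like residuals concentrated near zero. A toy encoder, using one common zigzag map for signed values (an assumption, not necessarily the convention used here):

```python
def rice_encode(values, k):
    """Rice-code signed residuals: map to non-negative integers with a
    zigzag map, then emit a unary quotient and k literal remainder bits.
    Returns the code as a string of '0'/'1' characters."""
    bits = []
    for v in values:
        m = (v << 1) if v >= 0 else -(v << 1) - 1   # zigzag map
        q, r = m >> k, m & ((1 << k) - 1)
        bits.append("1" * q + "0" + (format(r, f"0{k}b") if k else ""))
    return "".join(bits)

# Small residuals yield short quotients, so the total code stays compact
print(rice_encode([3, -1, 0], 2))  # 1010001000
```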

 

The cfitsio package [14] implements a tiled data compression scheme within the framework of the FITS standard. If data variables of differing statistical properties can be segmented into separate tiles, they can be approximated by methods optimized for such data.
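The tiled idea can be sketched with zlib standing in for the FITS compression codes; independent tiles are what let statistically different variables receive different coders:

```python
import zlib
import numpy as np

def compress_tiles(data, tile=(64, 64)):
    """Deflate each tile of a 2-D array independently, so tiles holding
    statistically different variables could later use different coders
    (zlib stands in here for the FITS compression algorithms)."""
    th, tw = tile
    out = {}
    for r in range(0, data.shape[0], th):
        for c in range(0, data.shape[1], tw):
            out[(r, c)] = zlib.compress(data[r:r + th, c:c + tw].tobytes())
    return out
```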

 

4. QUALITY ASSESSMENT

 

I determined the visual quality resulting from application of the approximation techniques detailed in Tables 3 and 4 to public domain images obtained via NASA’s SkyView and the HST Heritage website, together with full images from the SUBARU telescope observation archive. In particular, I focused on a Digitized Sky Survey image at 1024x1024 resolution centered on the Coma Cluster (Epoch J2000.0, rectangular equatorial coordinates, gnomonic projection), a full-sky 10-band COBE/DIRBE image (Epoch J2000.0, Galactic coordinates, Zenith Equal Area projection) and an HST color image of the filamentary nebulosity surrounding the Cas A remnant.

 

Figure 2 below demonstrates the visual quality resulting from converting an 8-bit 1024x1024 Digitized Sky Survey image of the Coma Cluster from NASA’s SkyView to a 4 bit-per-pixel image displayed by the FireView™ application at 160x120 resolution on a Palm IIIc emulator. The image is scrollable and zoomable, with a size of 187 kB, a compression factor of 5.6:1.

 

Figure 3 indicates the visual quality of an original full-sky map measured by the COBE/DIRBE mission. In particular, Figure 3(a) indicates the quality of a histogram-equalized Band 8 image raster. I decorrelated the spectrum via Principal Components Analysis of the image intensity in a 100x100x10 pixel window chosen arbitrarily at the Galactic Center. Although the spectral signature undoubtedly varies across the image, and perhaps also with time, the color space resulting from this approximate decorrelation sufficed to obtain a high visual quality reconstruction from the single most energetic component spectral band. This provided a compression factor of 10:1 additional to that provided by compression in the spatial domain. Subsequent compression by JPEG-2000 to 0.4 bpp, 0.2 bpp and 0.03 bpp resulted in the images displayed in Figure 3(b), (c) and (d), with an overall compression factor of 250:1. Figure 3(e) shows the result of JPEG-2000 compression after spectral decorrelation at 1 bpp, with additional deconvolution of the compressed image by the measured JPEG-2000 Point Spread Function, and Figure 3(f) shows the original image compressed to 0.03 bpp via the FRIT.
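The spectral decorrelation step can be sketched in NumPy. For simplicity this sketch fits the principal components on the whole cube rather than on a 100x100x10 window, and the band count in the usage below is illustrative:

```python
import numpy as np

def spectral_pca(cube, keep=1):
    """Decorrelate the spectral channels of an (ny, nx, nbands) cube via
    PCA and reconstruct from the `keep` most energetic components."""
    ny, nx, nb = cube.shape
    X = cube.reshape(-1, nb).astype(float)
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / len(Xc)
    w, V = np.linalg.eigh(cov)               # ascending eigenvalues
    V = V[:, np.argsort(w)[::-1]][:, :keep]  # most energetic first
    recon = (Xc @ V) @ V.T + mean
    return recon.reshape(ny, nx, nb)
```

Keeping 1 of 10 bands in the transformed space is what yields the additional 10:1 factor quoted in the text, provided the single component reconstructs the cube acceptably.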

 

Figure 4 depicts three visual representations of an HST image of the Cas A remnant. Figure 4(a) shows the color image reduced by a factor of 2 in size and from 24-bit color to 16-bit gray scale, of size 565 kB uncompressed (358 kB compressed via gzip), together with Figure 4(b), depicting a freehand artistic sketch of size 49 kB uncompressed (17 kB compressed via gzip), and Figure 4(c), illustrating a half-size freehand sketch of size 47 kB uncompressed (1.8 kB compressed via gzip). Even without line simplification, the sketches show a remarkable reduction in data volume while extracting useful topological information from the image. In the half-size sketch I have replaced the bright star images by points and reduced the polygonal outline of the filaments to a centerline. Active contours (e.g., snakes) show promise in extracting such information, and further research work in this area would be of value to astronomers.

 

By providing simple representations of data that are not misleading to astronomers, it is possible to provide higher quality information with fewer bits of data. Although astronomers frequently refer to images as sky maps, cartographic techniques such as exaggeration, simplification, aggregation and displacement of features are rarely applied. I represent a crowded stellar field detected by a data reduction system such as ESO-MIDAS by an oval or polygonal boundary, with point sources such as bright stars and galaxies represented by points or symbols. Filaments are reminiscent of linear features and may be simplified like roads.
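Simplifying a filament centerline like a road can use the Douglas-Peucker algorithm mentioned earlier; a compact recursive version:

```python
def simplify(points, tol):
    """Douglas-Peucker line simplification: keep the endpoints and recurse
    on the point farthest from the chord whenever it exceeds tol."""
    if len(points) < 3:
        return list(points)
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5 or 1.0
    dists = [abs(dy * (x - x0) - dx * (y - y0)) / norm
             for x, y in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] <= tol:
        return [points[0], points[-1]]
    return simplify(points[:i + 1], tol)[:-1] + simplify(points[i:], tol)

# A polyline with one small wiggle and one significant spike: the wiggle
# is dropped, the spike is kept.
print(simplify([(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0)], 1.0))
# [(0, 0), (2, 0), (3, 5), (4, 0)]
```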

 

