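Research on the Integration and Application of Geographic Information Systems and Remote Sensing in Surveying and Mapping

Geographic information systems (GIS) and remote sensing are two indispensable technologies in modern surveying and mapping.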
Foreign-Language Source with Chinese Translation

Metrics of scale in remote sensing and GIS

Michael F. Goodchild (National Center for Geographic Information and Analysis, Department of Geography, University of California, Santa Barbara)

ABSTRACT: The term scale has many meanings, some of which survive the transition from analog to digital representations of information better than others. Specifically, the primary metric of scale in traditional cartography, the representative fraction, has no well-defined meaning for digital data. Spatial extent and spatial resolution are both meaningful for digital data, and their ratio, symbolized as L/S, is dimensionless. L/S appears confined in practice to a narrow range. The implications of this observation are explored in the context of Digital Earth, a vision for an integrated geographic information system. It is shown that despite the very large data volumes potentially involved, Digital Earth is nevertheless technically feasible with today’s technology.

KEYWORDS: Scale, Geographic Information System, Remote Sensing, Spatial Resolution

INTRODUCTION: Scale is a heavily overloaded term in English, with abundant definitions attributable to many different and often independent roots, such that meaning is strongly dependent on context. Its meanings in “the scales of justice” or “scales over one’s eyes” have little connection to each other, or to its meaning in a discussion of remote sensing and GIS. But meaning is often ambiguous even in that latter context. For example, scale to a cartographer most likely relates to the representative fraction, or the scaling ratio between the real world and a map representation on a flat, two-dimensional surface such as paper, whereas scale to an environmental scientist likely relates either to spatial resolution (the representation’s level of spatial detail) or to spatial extent (the representation’s spatial coverage).
As a result, a simple phrase like “large scale” can send quite the wrong message when communities and disciplines interact - to a cartographer it implies fine detail, whereas to an environmental scientist it implies coarse detail. A computer scientist might say that in this respect the two disciplines were not interoperable.

In this paper I examine the current meanings of scale, with particular reference to the digital world, and the metrics associated with each meaning. The concern throughout is with spatial meanings, although temporal and spectral meanings are also important. I suggest that certain metrics survive the transition to digital technology better than others.

The main purpose of this paper is to propose a dimensionless ratio of two such metrics that appears to have interesting and useful properties. I show how this ratio is relevant to a specific vision for the future of geographic information technologies termed Digital Earth. Finally, I discuss how scale might be defined in ways that are accessible to a much wider range of users than cartographers and environmental scientists.

FOUR MEANINGS OF SCALE

LEVEL OF SPATIAL DETAIL

REPRESENTATIVE FRACTION

A paper map is an analog representation of geographic variation, rather than a digital representation. All features on the Earth’s surface are scaled using an approximately uniform ratio known as the representative fraction (it is impossible to use a perfectly uniform ratio because of the curvature of the Earth’s surface). The power of the representative fraction stems from the many different properties that are related to it in mapping practice. First, paper maps impose an effective limit on the positional accuracy of features, because of instability in the material used to make maps, limited ability to control the location of the pen as the map is drawn, and many other practical considerations.
Because positional accuracy on the map is limited, effective positional accuracy on the ground is determined by the representative fraction. A typical (and comparatively generous) map accuracy standard is 0.5 mm, and thus positional accuracy is 0.5 mm divided by the representative fraction (e.g., 12.5 m for a map at 1:25,000). Second, practical limits on the widths of lines and the sizes of symbols create a similar link between spatial resolution and representative fraction: it is difficult to show features much less than 0.5 mm across with adequate clarity. Finally, representative fraction serves as a surrogate for the features depicted on maps, in part because of this limit to spatial resolution, and in part because of the formal specifications adopted by mapping agencies, which are in turn related to spatial resolution. In summary, representative fraction characterizes many important properties of paper maps.

In the digital world these multiple associations are not necessarily linked. Features can be represented as points or lines, so the physical limitations to the minimum sizes of symbols that are characteristic of paper maps no longer apply. For example, a database may contain some features associated with 1:25,000 map specifications, but not all; and may include representations of features smaller than 12.5 m on the ground. Positional accuracy is also no longer necessarily tied to representative fraction, since points can be located to any precision, up to the limits imposed by internal representations of numbers (e.g., single precision is limited to roughly 7 significant digits, double precision to 15). Thus the three properties that were conveniently summarized by representative fraction - positional accuracy, spatial resolution, and feature content - are now potentially independent.

Unfortunately this has led to a complex system of conventions in an effort to preserve representative fraction as a universal defining characteristic of digital databases.
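The arithmetic in this passage is easy to verify. The following is a minimal Python sketch, not part of the original paper: `ground_accuracy_m` and `float32_roundtrip` are hypothetical helper names, and the 4,567,890.123 m value is an invented example of a projected coordinate. It shows how a 0.5 mm map accuracy standard translates into ground distance, and why single precision cannot hold metre-scale coordinates to sub-metre precision.

```python
import struct


def ground_accuracy_m(map_accuracy_mm: float, scale_denominator: float) -> float:
    """Ground distance (m) covered by `map_accuracy_mm` on a 1:scale_denominator map."""
    return map_accuracy_mm * scale_denominator / 1000.0


def float32_roundtrip(value: float) -> float:
    """Store `value` as an IEEE 754 single-precision number and read it back."""
    return struct.unpack("f", struct.pack("f", value))[0]


print(ground_accuracy_m(0.5, 25_000))   # 0.5 mm at 1:25,000
print(ground_accuracy_m(0.5, 12_000))   # 0.5 mm at 1:12,000

northing = 4_567_890.123                # hypothetical coordinate in metres
stored = float32_roundtrip(northing)
print(stored, abs(stored - northing))   # the fractional part is lost in single precision
```

In double precision the same coordinate is held to far better than a millimetre, which is one reason positional precision in a digital database is no longer tied to the representative fraction.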
When such databases are created directly from paper maps, by digitizing or scanning, it is possible for all three properties to remain correlated. But in other cases the representative fraction cited for a digital database is the one implied by its positional accuracy (e.g., a database has representative fraction 1:12,000 because its positional accuracy is 6 m); and in other cases it is the feature content or spatial resolution that defines the conventional representative fraction (e.g., a database has representative fraction 1:12,000 because features at least 6 m across are included). Moreover, these conventions are typically not understood by novice users - the general public, or children - who may consequently be very confused by the use of a fraction to characterize spatial data, despite its familiarity to specialists.

SPATIAL EXTENT

The term scale is often used to refer to the extent or scope of a study or project, and spatial extent is an obvious metric. It can be defined in area measure, but for the purposes of this discussion a length measure is preferred, and the symbol L will be used. For a square project area it can be set to the width of the area, but for rectangular or oddly shaped project areas the square root of area provides a convenient metric. Spatial extent defines the total amount of information relevant to a project, which rises with the square of a length measure.

PROCESS SCALE

The term process refers here to a computational model or representation of a landscape-modifying process, such as erosion or runoff.
From a computational perspective, a process is a transformation that takes a landscape from its existing state to some new state, and in this sense processes are a subset of the entire range of transformations that can be applied to spatial data.

Define a process as a mapping $b(x, t_2) = f(a(x, t_1))$, where $a$ is a vector of input fields, $b$ is a vector of output fields, $f$ is a function, $t$ is time, $t_2$ is later in time than $t_1$, and $x$ denotes location. Processes vary according to how they modify the spatial characteristics of their inputs, and these are best expressed in terms of contributions to the spatial spectrum. For example, some processes determine $b(x, t_2)$ based only on the inputs at the same location $a(x, t_1)$, and thus have minimal effect on spatial spectra. Other processes produce outputs that are smoother than their inputs, through processes of averaging or convolution, and thus act as low-pass filters. Less commonly, processes produce outputs that are more rugged than their inputs, by sharpening rather than smoothing gradients, and thus act as high-pass filters.

The scale of a process can be defined by examining the effects of spectral components on outputs. If some wavelength s exists such that components with wavelengths shorter than s have negligible influence on outputs, then the process is said to have a scale of s. It follows that if s is less than the spatial resolution S of the input data, the process will not be accurately modeled.

While these conclusions have been expressed in terms of spectra, it is also possible to interpret them in terms of variograms and correlograms. A low-pass filter reduces variance over short distances, relative to variance over long distances. Thus the short-distance part of the variogram is lowered, and the short-distance part of the correlogram is increased.
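The low-pass case can be illustrated numerically. The sketch below is an illustration under simple assumptions rather than anything taken from the paper: a synthetic one-dimensional white-noise transect stands in for an input field, a 5-cell moving average stands in for a smoothing process, and `semivariance` is the usual empirical estimator (half the mean squared difference at a given lag). All names and parameter values are chosen here for illustration only.

```python
import random


def semivariance(z, lag):
    """Empirical semivariance of a regularly spaced 1-D series at an integer lag."""
    diffs = [(z[i + lag] - z[i]) ** 2 for i in range(len(z) - lag)]
    return 0.5 * sum(diffs) / len(diffs)


def moving_average(z, width):
    """Centered moving average with an odd window width (a simple low-pass filter)."""
    half = width // 2
    return [sum(z[i - half:i + half + 1]) / width
            for i in range(half, len(z) - half)]


random.seed(0)                                    # fixed seed for repeatability
signal = [random.gauss(0.0, 1.0) for _ in range(2000)]
smooth = moving_average(signal, 5)

# Ratio of short-lag to long-lag semivariance, before and after filtering.
ratio_before = semivariance(signal, 1) / semivariance(signal, 50)
ratio_after = semivariance(smooth, 1) / semivariance(smooth, 50)
print(ratio_before, ratio_after)
```

With these settings the short-lag semivariance falls much further than the long-lag semivariance, which corresponds to the lowering of the short-distance part of the variogram described above.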
By the same reasoning, a high-pass filter increases variance over short distances relative to variance over long distances.

L/S RATIO

While scaling ratios make sense for analog representations, the representative fraction is clearly problematic for digital representations. But spatial resolution and spatial extent both appear to be meaningful in both analog and digital contexts, despite the problems with spatial resolution for vector data. Both S and L have dimensions of length, so their ratio is dimensionless. Dimensionless ratios often play a fundamental role in science (e.g., the Reynolds number in hydrodynamics), so it is possible that L/S might play a fundamental role in geographic information science. In this section I examine some instances of the L/S ratio, and possible interpretations that provide support for this speculation.

- Today’s computing industry seems to have settled on a screen standard of order 1 megapel, or 1 million picture elements. The first PCs had much coarser resolutions (e.g., the CGA standard of the early 1980s), but improvements in display technology led to a series of more and more detailed standards. Today, however, there is little evidence of pressure to improve resolution further, and the industry seems to be content with an L/S ratio of order $10^3$. Similar ratios characterize the current digital camera industry, although professional systems can be found with ratios as high as 4,000.
- Remote sensing instruments use a range of spatial resolutions, from the 1 m of IKONOS to the 1 km of AVHRR. Because a complete coverage of the Earth’s surface at 1 m requires on the order of $10^{15}$ pixels, data are commonly handled in more manageable tiles, or approximately rectangular arrays of cells. For years, Landsat TM imagery has been tiled in arrays of approximately 3,000 cells x 3,000 cells, for an L/S ratio of 3,000.
- The value of S for a paper map is determined by the technology of map-making, and techniques of symbolization, and a value of 0.5 mm is not atypical.
A map sheet 1 m across thus achieves an L/S ratio of 2,000.
- Finally, the human eye’s S can be defined as the size of a retinal cell, and the typical eye has order $10^8$ retinal cells, implying an L/S ratio of 10,000. Interestingly, then, the screen resolution that users find generally satisfactory corresponds approximately to the parameters of the human visual system; it is somewhat larger, but the computer screen typically fills only a part of the visual field.

These examples suggest that L/S ratios of between $10^3$ and $10^4$ are found across a wide range of technologies and settings, including the human eye. Two alternative explanations immediately suggest themselves: the narrow range may be the result of technological and economic constraints, and thus may expand as technology advances and becomes cheaper; or it may be due to cognitive constraints, and thus is likely to persist despite technological change.

This tension between technological, economic, and cognitive constraints is well illustrated by the case of paper maps, which evolved under what from today’s perspective were severe technological and economic constraints. For example, there are limits to the stability of paper and to the kinds of markings that can be made by hand-held pens. The costs of printing drop dramatically with the number of copies printed, because of strong economies of scale in the printing process, so maps must satisfy many users to be economically feasible. Goodchild [2000] has elaborated on these arguments. At the same time, maps serve cognitive purposes, and must be designed to convey information as effectively as possible. Any aspect of map design and production can thus be given two alternative interpretations: one, that it results from technological and economic constraints, and the other, that it results from the satisfaction of cognitive objectives.
If the former is true, then changes in technology may lead to changes in design and production; but if the latter is true, changes in technology may have no impact.

The persistent narrow range of L/S from paper maps to digital databases to the human eye suggests an interesting speculation: that cognitive, not technological or economic, objectives confine L/S to this range. From this perspective, L/S ratios of more than $10^4$ have no additional cognitive value, while L/S ratios of less than $10^3$ are perceived as too coarse for most purposes. If this speculation is true, it leads to some useful and general conclusions about the design of geographic information handling systems. In the next section I illustrate this by examining the concept of Digital Earth. For simplicity, the discussion centers on the log to base 10 of the L/S ratio, denoted by log L/S, and the speculation that its effective range is between 3 and 4.

This speculation also suggests a simple explanation for the fact that scale is used to refer both to L and to S in environmental science, without hopelessly confusing the listener. At first sight it seems counterintuitive that the same term should be used for two independent properties. But if the value of log L/S is effectively fixed, then spatial resolution and extent are strongly correlated: a coarse spatial resolution implies a large extent, and a detailed spatial resolution implies a small extent. If so, then the same term is able to satisfy both needs.

THE VISION OF DIGITAL EARTH

The term Digital Earth was coined in 1992 by U.S. Vice President Al Gore [Gore, 1992], but it was in a speech written for delivery in 1998 that Gore fully elaborated the concept (www.d~~Pl9980131 .html): “Imagine, for example, a young child going to a Digital Earth exhibit at a local museum. After donning a head-mounted display, she sees Earth as it appears from space.
Using a data glove, she zooms in, using higher and higher levels of resolution, to see continents, then regions, countries, cities, and finally individual houses, trees, and other natural and man-made objects. Having found an area of the planet she is interested in exploring, she takes the equivalent of a ‘magic carpet ride’ through a 3-D visualization of the terrain.”

This vision of Digital Earth (DE) is a sophisticated graphics system, linked to a comprehensive database containing representations of many classes of phenomena. It implies specialized hardware in the form of an immersive environment (a head-mounted display), with software capable of rendering the Earth’s surface at high speed, and from any perspective. Its spatial resolution ranges down to 1 m or finer. On the face of it, then, the vision suggests data requirements and bandwidths that are well beyond today’s capabilities. If each pixel of a 1 m resolution representation of the Earth’s surface was allocated an average of 1 byte then a total of 1 Pb of storage would be required; storage of multiple themes could push this total much higher. In order to zoom smoothly down to 1 m it would be necessary to store the data in a consistent data structure that could be accessed at many levels of resolution. Many data types are not obviously renderable (e.g., health, demographic, and economic data), suggesting a need for extensive research on visual representation.

The bandwidth requirements of the vision are perhaps the most daunting problem. To send 1 Pb of data at 1 Mb per second would take roughly a human lifetime, and over 12,000 years at 56 Kbps. Such requirements dwarf those of speech and even full-motion video. But these calculations assume that the DE user would want to see the entire Earth at 1 m resolution.
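The storage and transmission estimates above can be reproduced to an order of magnitude with a few lines of Python. This is a back-of-envelope sketch, not the author's calculation: it assumes a spherical Earth with a mean radius of 6,371 km, square cells, and 8 bits per byte, and the function names are invented for illustration. The exact durations depend on how the units are read (bits versus bytes, decimal versus binary prefixes), so only the orders of magnitude should be compared with the figures in the text.

```python
import math

EARTH_RADIUS_M = 6_371_000.0   # assumed mean radius, spherical approximation


def surface_cells(resolution_m: float) -> float:
    """Number of square cells of side `resolution_m` needed to cover the sphere."""
    area_m2 = 4.0 * math.pi * EARTH_RADIUS_M ** 2
    return area_m2 / resolution_m ** 2


def transfer_years(n_bytes: float, bits_per_second: float) -> float:
    """Years needed to send `n_bytes` over a link of `bits_per_second`."""
    seconds = n_bytes * 8.0 / bits_per_second
    return seconds / (365.25 * 24 * 3600)


cells_1m = surface_cells(1.0)             # cells at 1 m resolution
storage_bytes = cells_1m * 1.0            # 1 byte per cell, single theme
print(f"{cells_1m:.2e} cells, {storage_bytes:.2e} bytes")

petabyte = 1e15                           # decimal petabyte, as an assumption
print(transfer_years(petabyte, 1e6))      # link speed read as 1 megabit per second
print(transfer_years(petabyte, 56e3))     # 56 kilobits per second
```

Because the cell count scales with the inverse square of the resolution, coarsening the cells by a factor of 10 reduces both storage and transfer time by a factor of 100.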
The previous analysis of log L/S suggested that for cognitive (and possibly technological and economic) reasons user requirements rarely stray outside the range of 3 to 4, whereas a full Earth at 1 m resolution implies a log L/S of approximately 7. A log L/S of 3 suggests that a user interested in the entire Earth would be satisfied with 10 km resolution; a user interested in California might expect 1 km resolution; and a user interested in Santa Barbara County might expect 100 m resolution. Moreover, these resolutions need apply only to the center of the current field of view.

On this basis the bandwidth requirements of DE become much more manageable. Assuming an average of 1 byte per pixel, a megapel image requires order $10^7$ bps if refreshed once per second. Every one-unit reduction in log L/S results in two orders of magnitude reduction in bandwidth requirements. Thus a T1 connection seems sufficient to support DE, based on reasonable expectations about compression, and reasonable refresh rates. On this basis DE appears to be feasible with today’s communication technology.

CONCLUDING COMMENTS

I have argued that scale has many meanings, only some of which are well defined for digital data, and therefore useful in the digital world in which we increasingly find ourselves. The practice of establishing conventions which allow the measures of an earlier technology - the paper map - to survive in the digital world is appropriate for specialists, but is likely to make it impossible for non-specialists to identify their needs. Instead, I suggest that two measures, identified here as the large measure L and the small measure S, be used to characterize the scale properties of geographic data.

The vector-based representations do not suggest simple bases for defining S, because their spatial resolutions are either variable or arbitrary. On the other hand spatial variation in S makes good sense in many situations.
In social applications, it appears that the processes that characterize human behavior are capable of operating at different scales, depending on whether people act in the intensive pedestrian-oriented spaces of the inner city or the extensive car-oriented spaces of the suburbs. In environmental applications, variation in apparent spatial resolution may be a logical sampling response to a phenomenon that is known to have more rapid variation in some areas than others; from a geostatistical perspective this might suggest a non-stationary variogram or correlogram (for examples of non-stationary geostatistical analysis see Atkinson [2001]). This may be one factor in the spatial distribution of weather observation networks (though others, such as uneven accessibility, and uneven need for information are also clearly important).

The primary purpose of this paper has been to offer a speculation on the significance of the dimensionless ratio L/S. The ratio is the major determinant of data volume, and consequently processing speed, in digital systems. It also has cognitive significance because it can be defined for the human visual system. I suggest that there are few reasons in practice why log L/S should fall outside the range 3 - 4, and that this provides an important basis for designing systems for handling geographic data. Digital Earth was introduced as one such system. A constrained ratio also implies that L and S are strongly correlated in practice, as suggested by the common use of the same term scale to refer to both.

ACKNOWLEDGMENT

The Alexandria Digital Library and its Alexandria Digital Earth Prototype, the source of much of the inspiration for this paper, are supported by the U.S. National Science Foundation.

REFERENCES

Atkinson, P.M., 2001. Geographical information science: Geocomputation and nonstationarity. Progress in Physical Geography 25(1): 111-122.
Goodchild, M.F., 2000. Communicating geographic information in a digital age.
Annals of the Association of American Geographers 90(2): 344-355.
Goodchild, M.F. & J. Proctor, 1997. Scale in a digital geographic world. Geographical and Environmental Modelling 1(1): 5-23.
Gore, A., 1992. Earth in the Balance: Ecology and the Human Spirit. Houghton Mifflin, Boston, 407 pp.
Lam, N-S & D. Quattrochi, 1992. On the issues of scale, resolution, and fractal analysis in the mapping sciences. Professional Geographer 44(1): 88-98.
Quattrochi, D.A. & M.F. Goodchild (Eds), 1997. Scale in Remote Sensing and GIS. Lewis Publishers, Boca Raton, 406 pp.

Chinese translation: Metrics of scale in remote sensing and GIS. Michael F. Goodchild (National Center for Geographic Information and Analysis, Department of Geography, University of California, Santa Barbara). Abstract: The term scale has many meanings, some of which survive the transition from analog to digital representations of information better than others.
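Chinese-English Glossary of Remote Sensing and Related Terms (Incomplete Edition)

ADS: Arc Digitizing System.
Aerophotograph: aerial photograph.
Aerophotographical Scale: aerial photographic scale.
Allocation: the process of assigning arcs in a network to the closest center, within the limits of maximum impedance or resource capacity.
AM/FM: abbreviation of Automated Mapping/Facilities Management. It is an interactive computer system, combining graphics and text, for managing equipment and production technology on the basis of geographic information, and an application software system that combines graphics technology with database management technology. With an AM/FM system, the planning, construction, service connection, dispatching, operation, maintenance, and commercial power supply of a power transmission and distribution network can be managed with computer assistance. It is currently an advanced, practical, and well-suited application for public utilities that need computer-aided management of dispersed equipment (as opposed to equipment that is geographically concentrated, as in power plants or steel mills). An AM/FM system is a new kind of information management system for production and operations units, developed on top of a geographic information system (GIS) to meet the needs of equipment engineering management and the requirements of production technology management. In many contexts AM/FM/GIS is also used to denote an AM/FM system.
Annotation: 1. Text that describes the features in a layer, used for display and not for analysis. 2. A feature class in a layer that is used to label other features. See also TAT.
ANSI: American National Standards Institute, a national organization that coordinates standardization.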
Foreign-Language Translation with Chinese-English Parallel Text (the document contains the English original and a Chinese translation)

A Survey on Spatio-Temporal Data Warehousing

Abstract

Geographic Information Systems (GIS) have been extensively used in various application domains, ranging from economical, ecological and demographic analysis, to city and route planning. Nowadays, organizations need sophisticated GIS-based Decision Support Systems (DSS) to analyze their data with respect to geographic information, represented not only as attribute data, but also in maps. Thus, vendors are increasingly integrating their products, leading to the concept of SOLAP (Spatial OLAP). Also, in the last years, and motivated by the explosive growth in the use of PDA devices, the field of moving object data has been receiving attention from the GIS community. However, not much has been done in providing moving object databases with OLAP functionality. In the first part of this paper we survey the SOLAP literature. We then move to Spatio-Temporal OLAP, in particular addressing the problem of trajectory analysis. We finally provide an in-depth comparative analysis between two proposals introduced in the context of the GeoPKDD EU project: the Hermes-MDC system, and Piet, a proposal for SOLAP and moving objects, developed at the University of Buenos Aires, Argentina.

Keywords: GIS, OLAP, Data Warehousing, Moving Objects, Trajectories, Aggregation

INTRODUCTION

Geographic Information Systems (GIS) have been extensively used in various application domains, ranging from economical, ecological and demographic analysis, to city and route planning (Rigaux, Scholl, & Voisard, 2001; Worboys, 1995). Spatial information in a GIS is typically stored in different so-called thematic layers (also called themes). Information in themes can be stored in data structures according to different data models, the most usual ones being the raster model and the vector model. In a thematic layer, spatial data is annotated with classical relational attribute information, of (in general) numeric or string type.
While spatial data is stored in data structures suitable for these kinds of data, associated attributes are usually stored in conventional relational databases. Spatial data in the different thematic layers of a GIS system can be mapped univocally to each other using a common frame of reference, like a coordinate system. These layers can be overlapped or overlaid to obtain an integrated spatial view.

On the other hand, OLAP (On Line Analytical Processing) (Kimball, 1996; Kimball & Ross, 2002) comprises a set of tools and algorithms that allow efficiently querying multidimensional databases, containing large amounts of data, usually called Data Warehouses. In OLAP, data is organized as a set of dimensions and fact tables. In the multidimensional model, data can be perceived as a data cube, where each cell contains a measure or set of (probably aggregated) measures of interest. As we discuss later, OLAP dimensions are further organized in hierarchies that favor the data aggregation process (Cabibbo & Torlone, 1997). Several techniques and algorithms have been developed for query processing, most of them involving some kind of aggregate precomputation (Harinarayan, Rajaraman, & Ullman, 1996).

The need for OLAP in GIS

Different data models have been proposed for representing objects in a GIS. ESRI first introduced the Coverage data model to bind geometric objects to non-spatial attributes that describe them. Later, they extended this model with object-oriented support, in a way that behavior can be defined for geographic features (Zeiler, 1999). The idea of the Coverage data model is also supported by the Reference Model proposed by the Open Geospatial Consortium. Thus, in spite of the model of choice, there is always the underlying idea of binding geometric objects to objects or attributes stored in (mostly) object-relational databases (Stonebraker & Moore, 1996).
In addition, query tools in commercial GIS allow users to overlap several thematic layers in order to locate objects of interest within an area, like schools or fire stations. For this, they use indexing structures based on R-trees (Gutman, 1984). GIS query support sometimes includes aggregation of geographic measures, for example, distances or areas (e.g., representing different geological zones). However, these aggregations are not the only ones that are required, as we discuss below.

Nowadays, organizations need sophisticated GIS-based Decision Support Systems (DSS) to analyze their data with respect to geographic information, represented not only as attribute data, but also in maps, probably in different thematic layers. In this sense, OLAP and GIS vendors are increasingly integrating their products (see, for instance, the Microstrategy and MapInfo integration). Aggregate queries are central to DSSs. Classical aggregate OLAP queries (like “total sales of cars in California”), and aggregation combined with complex queries involving geometric components (“total sales in all villages crossed by the Mississippi river and within a radius of 100 km around New Orleans”) must be efficiently supported. Moreover, navigation of the results using typical OLAP operations like roll-up or drill-down is also required. These operations are not supported by commercial GIS in a straightforward way. One of the reasons is that the GIS data models discussed above were developed with “transactional” queries in mind. Thus, the databases storing non-spatial attributes or objects are designed to support those (non-aggregate) kinds of queries. Decision support systems need a different data model, where non-spatial data, probably consolidated from different sectors in an organization, is stored in a data warehouse.
Here, numerical data are stored in fact tables built along several dimensions. For instance, if we are interested in the sales of certain products in stores in a given region, we may consider the sales amounts in a fact table over the three dimensions Store, Time and Product. In order to guarantee summarizability (Lenz & Shoshani, 1997), dimensions are organized into aggregation hierarchies. For example, stores can aggregate over cities which in turn can aggregate into regions and countries. Each of these aggregation levels can also hold descriptive attributes like city population, the area of a region, etc. To fulfill the requirements of integrated GIS-DSS, warehouse data must be linked to geographic data. For instance, a polygon representing a region must be associated to the region identifier in the warehouse. Besides, system integration in commercial GIS is not an easy task. In the current commercial applications, the GIS and OLAP worlds are integrated in an ad-hoc fashion, probably in a different way (and using different data models) each time an implementation is required, even when a data warehouse is available for non-spatial data.

An Introductory Example. We now present a real-world example to illustrate some issues in the spatial warehousing problem area. We selected four layers with geographic and geological features obtained from the National Atlas Website. These layers contain the following information: states, cities, and rivers in North America, and volcanoes in the northern hemisphere (published by the Global Volcanism Program - GVP). Figure 1 shows a detail of the layers containing cities and rivers in North America, displayed using the graphic interface of the Piet implementation we discuss later in the paper. Note the density of the points representing cities (particularly in the eastern region). Rivers are represented as polylines.
Figure 2 shows a portion of two overlaid layers containing states (represented as polygons) and volcanoes in the northern hemisphere. There is also non-spatial information stored in a conventional data warehouse. In this data warehouse, dimension tables contain customer, store and product information, and a fact table contains store sales across time. Also, numerical and textual information on the geographic components exists (e.g., population, area), stored as usual as attributes of the GIS layers.

In the scenario above, conventional GIS and organizational data can be integrated for decision support analysis. Sales information could be analyzed in the light of geographical features, conveniently displayed in maps. This analysis could benefit from the integration of both worlds in a single framework. Even though this integration could be possible with existing technologies, ad-hoc solutions are expensive because, besides requiring lots of complex coding, they are hardly portable. To make things more difficult, ad-hoc solutions require data exchange between GIS and OLAP applications. This implies that the output of a GIS query must probably be exported as members in dimensions of a data cube, and merged for further analysis. For example, suppose that a business analyst is interested in studying the sales of nautical goods in stores located in cities crossed by rivers. She could first query the GIS, to obtain the cities of interest. She probably has stored sales in a data cube containing a dimension Store or Geography with city as a dimension level. She would need to “manually” select the cities of interest (i.e., the ones returned by the GIS query) in the cube, to be able to go on with the analysis (in the best case, an ad-hoc customized middleware could help her). Of course, she must repeat this for each query involving a (geographic) dimension in the data cube.

Figure 1. Two overlaid layers containing cities and rivers in North America.

On the contrary, GIS/Data warehousing integration can provide a more natural solution.

The second part of this survey is devoted to spatio-temporal data warehousing and OLAP. Moving objects databases (MOD) have been receiving increasing attention from the database community in recent years, mainly due to the wide variety of applications that technology allows nowadays. Trajectories of moving objects like cars or pedestrians can be reconstructed by means of samples describing the locations of these objects at certain points in time.

Figure 2. Two overlaid layers containing states in North America and volcanoes in the northern hemisphere.

Although there exist many proposals for modeling and querying moving objects, only a small part of them address the problem of aggregation of moving objects data in a GIS (Geographic Information Systems) scenario. Many interesting applications arise, involving moving objects aggregation, mainly regarding traffic analysis, truck fleet behavior analysis, commuter traffic in a city, passenger traffic in an airport, or shopping behavior in a mall. Building trajectory data warehouses that can integrate with a GIS is an open problem that is starting to attract database researchers. Finally, the MOD setting is appropriate for data mining tasks, and we also comment on this in the paper.

In this paper, we first provide a brief background on GIS, data warehousing and OLAP, and a review of the state-of-the-art in spatial OLAP. After this, we move on to study spatio-temporal data warehousing, OLAP and mining.
We then provide a detailed analysis of the Piet framework, aimed at integrating GIS, OLAP and moving object data, and conclude with a comparison between this proposal and the Hermes data cartridge and trajectory data warehouse developed in the context of the GeoPKDD project (information about the GeoPKDD project can be found at http://www.geopkdd.eu).

A SHORT BACKGROUND

GIS

In general, information in a GIS application is divided over several thematic layers. The information in each layer consists of purely spatial data on the one hand, combined with classical alphanumeric attribute data on the other hand (usually stored in a relational database). Two main data models are used for the representation of the spatial part of the information within one layer: the vector model and the raster model. The choice of model typically depends on the data source from which the information is imported into the GIS.

The Vector Model. The vector model is the most widely used in current GIS (Kuper & Scholl, 2000). In the vector model, infinite sets of points in space are represented as finite geometric structures, or geometries, such as points, polylines and polygons. More concretely, vector data within a layer consists of a finite number of tuples of the form (geometry, attributes), where a geometry can be a point, a polyline or a polygon. There are several possible data structures to actually store these geometries (Worboys, 1995).

The Raster Model. In the raster model, the space is sampled into pixels or cells, each one having an associated attribute or set of attributes. Usually, these cells form a uniform grid in the plane. For each cell or pixel, the sample value of some function is computed and associated to the cell as an attribute value, e.g., a numeric value or a color. In general, information represented in the raster model is organized into zones, where the cells of a zone have the same value for some attribute(s).
The raster model has very efficient indexing structures and is very well suited to modeling continuous change, but its disadvantages include its size and the cost of computing the zones.

Spatial information in the different thematic layers in a GIS is often joined or overlaid. Queries requiring map overlay are more difficult to compute in the vector model than in the raster model. On the other hand, the vector model offers a concise representation of the data, independent of the resolution. For a uniform treatment of different layers given in the vector or the raster model, in this paper we treat the raster model as a special case of the vector model. Indeed, conceptually, each cell is, and each pixel can be regarded as, a small polygon; also, the attribute value associated to the cell or pixel can be regarded as an attribute in the vector model.

Data Warehousing and OLAP

The importance of data analysis has increased significantly in recent years as organizations in all sectors are required to improve their decision-making processes in order to maintain their competitive advantage. We said before that OLAP (On Line Analytical Processing) (Kimball, 1996; Kimball & Ross, 2002) comprises a set of tools and algorithms that allow efficient querying of databases that contain large amounts of data. These databases, usually designed for read-only access (in general, updating is performed off-line), are called data warehouses. Data warehouses are exploited in different ways; OLAP is one of them. OLAP systems are based on a multidimensional model, which allows a better understanding of data for analysis purposes and provides better performance for complex analytical queries. The multidimensional model allows viewing data in an n-dimensional space, usually called a data cube (Kimball & Ross, 2002). In this cube, each cell contains a measure or set of (probably aggregated) measures of interest.
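The treatment of the raster model as a special case of the vector model can be illustrated with a minimal Python sketch. It assumes a uniform square grid stored as a list of rows; the names `raster_to_vector`, `cell_size`, `origin` and `attribute` are hypothetical and chosen only for this illustration, not taken from any GIS library or from the surveyed systems.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]                        # ring of vertices, first vertex not repeated
Feature = Tuple[Polygon, Dict[str, float]]   # the (geometry, attributes) tuple of the vector model


def raster_to_vector(grid: List[List[float]], cell_size: float,
                     origin: Point = (0.0, 0.0),
                     attribute: str = "value") -> List[Feature]:
    """Turn every raster cell into a (square polygon, attributes) tuple."""
    x0, y0 = origin
    features: List[Feature] = []
    for row, values in enumerate(grid):
        for col, value in enumerate(values):
            left = x0 + col * cell_size
            bottom = y0 + row * cell_size
            # Each cell becomes a small square polygon (counter-clockwise ring).
            square = [(left, bottom),
                      (left + cell_size, bottom),
                      (left + cell_size, bottom + cell_size),
                      (left, bottom + cell_size)]
            # The cell's sample value becomes an ordinary vector attribute.
            features.append((square, {attribute: value}))
    return features


# A 2 x 2 raster with 10-unit cells becomes a vector layer of four polygons.
layer = raster_to_vector([[1.0, 2.0], [3.0, 4.0]], cell_size=10.0)
```

The sketch also makes the size disadvantage visible: the vector layer holds one polygon per cell, so the number of tuples grows with the square of the grid width.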
This factual data can be analyzed along dimensions of interest, usually organized in hierarchies (Cabibbo & Torlone, 1997). There are three typical ways of implementing OLAP tools: MOLAP (multidimensional OLAP), where data is stored in proprietary multidimensional structures; ROLAP (relational OLAP), where data is stored in (object) relational databases; and HOLAP (hybrid OLAP), which provides both solutions. In a ROLAP environment, data is organized as a set of dimension tables and fact tables, and we assume this organization in the remainder of the paper.

There are a number of OLAP operations that allow exploiting the dimensions and their hierarchies, thus providing an interactive data analysis environment. Warehouse databases are optimized for OLAP operations which, typically, imply data aggregation or de-aggregation along a dimension, called roll-up and drill-down, respectively. Other operations involve selecting parts of a cube (slice and dice) and reorienting the multidimensional view of data (pivoting). In addition to the basic operations described above, OLAP tools provide a great variety of mathematical, statistical, and financial operators for computing ratios, variances, ranks, etc.

It is an accepted fact that data warehouse (conceptual) design is still an open issue in the field (Rizzi & Golfarelli, 2000). Most of the data models either provide a graphical representation based on the Entity-Relationship (E/R) model or UML notations, or they just provide some formal definitions without user-oriented graphical support. Recently, Malinowski and Zimányi (2006) proposed the MultiDim model. This model is based on the E/R model and provides an intuitive graphical notation.
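The roll-up and slice operations mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fact rows, the store-to-city hierarchy, and the function names `roll_up` and `slice_cube` are all hypothetical, and a real ROLAP engine would express the same steps as SQL over dimension and fact tables.

```python
from collections import defaultdict

# Hypothetical fact table rows: (store, product, sales amount).
facts = [
    ("store_a", "boat", 100.0),
    ("store_b", "boat", 50.0),
    ("store_c", "sail", 70.0),
]

# Hypothetical dimension hierarchy: level "store" rolls up to level "city".
store_to_city = {"store_a": "Antwerp", "store_b": "Antwerp", "store_c": "Ghent"}


def roll_up(rows, hierarchy):
    """Aggregate (SUM) sales from the store level to the city level."""
    totals = defaultdict(float)
    for store, product, amount in rows:
        totals[(hierarchy[store], product)] += amount
    return dict(totals)


def slice_cube(rows, product):
    """Slice: keep only the facts for one member of the product dimension."""
    return [row for row in rows if row[1] == product]


by_city = roll_up(facts, store_to_city)   # coarser view along the hierarchy
boats = slice_cube(facts, "boat")         # sub-cube for a single product
```

Drill-down is the inverse navigation: starting from `by_city`, the analyst returns to the finer store-level rows in `facts`.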
Also recently, Vaisman (2006a, 2006b) introduced a methodology for requirement elicitation in Decision Support Systems, arguing that methodologies used for OLTP systems are not appropriate for OLAP systems.

Temporal Data Warehouses

The relational data model, as proposed by Codd (1970), is not well suited for handling spatial and/or temporal data. In this model, data evolution over time must be treated in the same way as ordinary data. This is not enough for applications that require past, present, and/or future data values to be dealt with by the database. In real life such applications abound. Therefore, in the last decades, much research has been done in the field of temporal databases. Snodgrass (1995) describes the design of the TSQL2 Temporal Query Language, an upward compatible extension of SQL-92. The book written as a result of a Dagstuhl seminar organized in June 1997 by Etzion, Jajodia, and Sripada (1998) contains a comprehensive bibliography, glossaries for both temporal database and time granularity concepts, and summaries of work around 1998. The same author (Snodgrass, 1999), in other work, discusses practical research issues on temporal database design and implementation.

Regarding temporal data warehousing and OLAP, Mendelzon and Vaisman (2000, 2003) proposed a model, denoted TOLAP, and developed a prototype and a datalog-like query language, based on a (temporal) star schema. Vaisman, Izquierdo, and Ktenas (2006) also present a Web-based implementation of this model, along with a query language called TOLAP-QL. Eder, Koncilia, and Morzy (2002) also propose a data model for temporal OLAP supporting structural changes. Despite these efforts, little attention has been devoted to the problem of conceptual and logical modeling for temporal data warehouses.

SPATIAL DATA WAREHOUSING AND OLAP

Spatial database systems have been studied for a long time (Buchmann, Günther, Smith, & Wang, 1990; Paredaens, Van Den Bussche, & Gucht, 1994). Rigaux et al.
(2001) survey various techniques, such as spatial data models, algorithms, and indexing methods, developed to address specific features of spatial data that are not adequately handled by mainstream DBMS technology.

Although some authors have pointed out the benefits of combining GIS and OLAP, not much work has been done in this field. Vega López, Snodgrass, and Moon (2005) present a comprehensive survey on spatio-temporal aggregation that includes a section on spatial aggregation. Also, Bédard, Rivest, and Proulx (2007) present a review of the efforts for integrating OLAP and GIS. As we explain later, efficient data aggregation is crucial for a system with GIS-OLAP capabilities.

Conceptual Modeling and SOLAP

Rivest, Bédard, and Marchand (2001) introduced the concept of SOLAP (Spatial OLAP), a paradigm aimed at exploring spatial data by drilling on maps, in a way analogous to what is performed in OLAP with tables and charts. They describe the desirable features and operators a SOLAP system should have. Although they do not present a formal model for this, SOLAP concepts and operators have been implemented in a commercial tool called JMAP, developed by the Centre for Research in Geomatics and KHEOPS, see /en/jmap/solap.jsp. Stefanovic, Han, and Koperski (2000) and Bédard, Merret, and Han (2001) classify spatial dimension hierarchies according to their spatial references as: (a) non-geometric; (b) geometric to non-geometric; and (c) fully geometric. Dimensions of type (a) can be treated as any descriptive dimension (Rivest et al., 2001). In dimensions of types (b) and (c), a geometry is associated to members of the hierarchies. Malinowski and Zimányi (2004) extend this classification to consider that even in the absence of several related spatial levels, a dimension can be considered spatial.
Here, a dimension level is spatial if it is represented as a spatial data type (e.g., point, region), which allows spatial levels to be linked through topological relationships (e.g., contains, overlaps). Thus, a spatial dimension is a dimension that contains at least one spatial hierarchy. A critical point in spatial dimension modeling is the problem of multiple dependencies, meaning that an element in one level can be related to more than one element in a level above it in the hierarchy. Jensen, Kligys, Pedersen, and Timko (2004) address this issue, and propose a multidimensional data model for mobile services, i.e., services that deliver content to users depending on their location.

This model supports different kinds of dimension hierarchies, most remarkably multiple hierarchies in the same dimension, i.e., multiple aggregation paths. Full and partial containment hierarchies are also supported. However, the model does not consider the geometry, limiting the set of queries that can be addressed. This means that spatial dimensions are standard dimensions referring to some geographical element (like cities or roads).

Malinowski and Zimányi (2006) also propose a model supporting multiple aggregation paths. Pourabbas (2003) introduces a conceptual model that uses binding attributes to bridge the gap between spatial databases and a data cube. The approach relies on the assumption that all the cells in the cube contain a value, which is not the usual case in practice, as the author acknowledges. Also, the approach requires modifying the structure of the spatial data to support the model. No implementation is presented.

Shekhar, Lu, Tan, Chawla, and Vatsavai (2001) introduced MapCube, a visualization tool for spatial data cubes. MapCube is an operator that, given a so-called base map, cartographic preferences and an aggregation hierarchy, produces an album of maps that can be navigated via roll-up and drill-down operations.

Spatial Measures.
Measures are characterized in two ways in the literature, namely: (a) measures representing a geometry, which can be aggregated along the dimensions; (b) measures representing a numerical value, obtained using a topological or metric operator. Most proposals support option (a), either as a set of coordinates (Bédard et al., 2001; Rivest et al., 2001; Malinowski & Zimányi, 2004; Bimonte, Tchounikine, & Miquel, 2005), or as a set of pointers to geometric objects (Stefanovic et al., 2000). Bimonte et al. (2005) define measures as complex objects (a measure is thus an object containing several attributes). Malinowski and Zimányi (2004) follow a similar approach, but define measures as attributes of an n-ary fact relationship between dimensions.

Damiani and Spaccapietra (2006) propose MuSD, a model that allows spatial measures to be defined at different granularities. Here, a spatial measure can represent the location of a fact at multiple levels of (spatial) granularity. Also, an algebra of SOLAP operators is proposed.

Spatial Aggregation

In light of the discussion above, it should be clear that aggregation is a crucial issue in spatial OLAP. Moreover, there is not yet a consensus about a complete set of aggregate operators for spatial OLAP. We now discuss the classic approaches to spatial aggregation. Han et al. (1998) use OLAP techniques for materializing selected spatial objects, and propose a so-called Spatial Data Cube and the set of operations that can be performed on this data cube. The model only supports aggregation of spatial objects.

Pedersen and Tryfona (2001) propose the pre-aggregation of spatial facts. First, they pre-process these facts, computing their disjoint parts in order to be able to aggregate them later. This pre-aggregation works if the spatial properties of the objects are distributive over some aggregate function.
Again, the spatial measures are geometric objects.

Given that this proposal ignores the geometries, queries like "total population of cities crossed by a river" are not supported. The paper does not address forms other than polygons, although the authors claim that other more complex forms are supported by the method, and the authors do not report experimental results.

With a different approach, Rao, Zhang, Yu, Li, and Chen (2003), and Zhang, Li, Rao, Yu, Chen, and Liu (2003) combine OLAP and GIS for querying so-called spatial data warehouses, using R-trees for accessing data in fact tables. The data warehouse is then exploited in the usual OLAP way. Thus, they take advantage of OLAP hierarchies for locating information in the R-tree which indexes the fact table.

Although the measures here are not only spatial objects, the proposal also ignores the geometric part of the model, limiting the scope of the queries that can be addressed. It is assumed that some fact table containing the identifiers of spatial objects exists. Finally, these objects happen to be points, which is quite unrealistic in a GIS environment, where different types of objects appear in the different layers.

Some interesting techniques have been recently introduced to address the data aggregation problem. These techniques are based on the combined use of (R-tree-based) indexes, materialization (or pre-aggregation) of aggregate measures, and computational geometry algorithms.

Papadias, Tao, Kalnis, and Zhang (2002) introduce the Aggregation R-tree (aR-tree), combining indexing with pre-aggregation. The aR-tree is an R-tree that annotates each MBR (Minimal Bounding Rectangle) with the value of the aggregate function for all the objects that are enclosed by it. They extend this proposal in order to handle historic information (see the section on moving object data below), denoting this extension aRB-tree (Papadias, Tao, Zhang, Mamoulis, Shen, & Sun, 2002).
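The idea of annotating each MBR with a pre-computed aggregate can be illustrated with a small Python sketch. It is not the implementation of Papadias et al.: the tree is built by hand rather than by R-tree insertion and splitting, the leaf entries are assumed to be points, the aggregate is assumed to be SUM, and all names (`Node`, `make_leaf`, `make_inner`, `window_sum`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def contains(outer: Rect, inner: Rect) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Node:
    mbr: Rect
    total: float                      # pre-aggregated SUM of everything below this node
    children: List["Node"] = field(default_factory=list)  # empty for a leaf entry


def make_leaf(x: float, y: float, value: float) -> Node:
    # A point object is stored as a degenerate rectangle.
    return Node((x, y, x, y), value)


def make_inner(children: List[Node]) -> Node:
    mbr = (min(c.mbr[0] for c in children), min(c.mbr[1] for c in children),
           max(c.mbr[2] for c in children), max(c.mbr[3] for c in children))
    return Node(mbr, sum(c.total for c in children), children)


def window_sum(node: Node, window: Rect) -> float:
    """SUM of the values of all point objects inside the query window."""
    if not intersects(node.mbr, window):
        return 0.0                    # prune: nothing below can qualify
    if contains(window, node.mbr):
        return node.total             # use the stored aggregate, no descent
    return sum(window_sum(child, window) for child in node.children)


west = make_inner([make_leaf(1, 1, 10.0), make_leaf(2, 3, 20.0)])
east = make_inner([make_leaf(8, 1, 5.0), make_leaf(9, 4, 7.0)])
root = make_inner([west, east])

total_west = window_sum(root, (0, 0, 5, 5))      # answered from west.total alone
partial = window_sum(root, (0, 0, 8.5, 2))       # must descend into both subtrees
```

Because the leaves here are points, an entry is either inside the window or outside it. With extended objects a leaf MBR can partially overlap the window, in which case the stored aggregate alone cannot give an exact answer.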
The aRB-tree approach basically consists of two kinds of indexes: a host index, which is an R-tree with the summarized information, and a B-tree containing time-varying aggregate data. In the most general case, each region has an associated B-tree with the historical information of the measures of interest in the region. This is a very efficient solution for some kinds of queries, for example, window aggregate queries (i.e., the computation of the aggregate measure of the regions which intersect a spatio-temporal window). In addition, the method is very effective when a query is posed over a query region whose intersection with the objects in a map must be computed on the fly, and these objects are totally enclosed in the query region. However, problems may appear when leaf entries partially overlap the query window. In this case, the result must be estimated, or the actual results computed using the base tables. In fact, Tao, Kollios, Considine, Li, and Papadias (2004) show that the aRB-tree can suffer from the distinct counting problem if the object remains in the same region for several timestamps.

A Survey of Spatio-Temporal Data Warehousing

Abstract: Geographic information systems have been widely used in different application areas, including economic, ecological and demographic analysis, and city and route planning.
KEYWORDS: Scale, Geographic Information System, Remote Sensing, Spatial Resolution

INTRODUCTION: Scale is a heavily overloaded term in English, with abundant definitions attributable to many different and often independent roots, such that meaning is strongly dependent on context. Its meanings in “the scales of justice” or “scales over one’s eyes” have little connection to each other, or to its meaning in a discussion of remote sensing and GIS. But meaning is often ambiguous even in that latter context. For example, scale to a cartographer most likely relates to the representative fraction, or the scaling ratio between the real world and a map representation on a flat, two-dimensional surface such as paper, whereas scale to an environmental scientist likely relates either to spatial resolution (the representation’s level of spatial detail) or to spatial extent (the representation’s spatial coverage). As a result, a simple phrase like “large scale” can send quite the wrong message when communities and disciplines interact - to a cartographer it implies fine detail, whereas to an environmental scientist it implies coarse detail. A computer scientist might say that in this respect the two disciplines were not interoperable.

In this paper I examine the current meanings of scale, with particular reference to the digital world, and the metrics associated with each meaning. The concern throughout is with spatial meanings, although temporal and spectral meanings are also important. I suggest that certain metrics survive the transition to digital technology better than others.

The main purpose of this paper is to propose a dimensionless ratio of two such metrics that appears to have interesting and useful properties. I show how this ratio is relevant to a specific vision for the future of geographic information technologies termed Digital Earth.
Finally, I discuss how scale might be defined in ways that are accessible to a much wider range of users than cartographers and environmental scientists.

FOUR MEANINGS OF SCALE

LEVEL OF SPATIAL DETAIL

REPRESENTATIVE FRACTION

A paper map is an analog representation of geographic variation, rather than a digital representation. All features on the Earth’s surface are scaled using an approximately uniform ratio known as the representative fraction (it is impossible to use a perfectly uniform ratio because of the curvature of the Earth’s surface). The power of the representative fraction stems from the many different properties that are related to it in mapping practice. First, paper maps impose an effective limit on the positional accuracy of features, because of instability in the material used to make maps, limited ability to control the location of the pen as the map is drawn, and many other practical considerations. Because positional accuracy on the map is limited, effective positional accuracy on the ground is determined by the representative fraction. A typical (and comparatively generous) map accuracy standard is 0.5 mm, and thus positional accuracy is 0.5 mm divided by the representative fraction (e.g., 12.5 m for a map at 1:25,000). Second, practical limits on the widths of lines and the sizes of symbols create a similar link between spatial resolution and representative fraction: it is difficult to show features much less than 0.5 mm across with adequate clarity. Finally, representative fraction serves as a surrogate for the features depicted on maps, in part because of this limit to spatial resolution, and in part because of the formal specifications adopted by mapping agencies, which are in turn related to spatial resolution. In summary, representative fraction characterizes many important properties of paper maps.

In the digital world these multiple associations are not necessarily linked.
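The arithmetic linking map accuracy to the representative fraction can be checked with a short Python sketch. The function names are hypothetical; the 0.5 mm standard and the 1:25,000 example come from the text, and the inverse calculation simply applies the same relation in the other direction.

```python
def ground_distance_m(map_distance_mm: float, scale_denominator: float) -> float:
    """Ground distance (metres) for a map distance (millimetres) at scale 1:denominator."""
    return map_distance_mm * scale_denominator / 1000.0


def implied_scale_denominator(ground_m: float, map_distance_mm: float = 0.5) -> float:
    """Scale denominator at which `map_distance_mm` on paper equals `ground_m` on the ground."""
    return ground_m * 1000.0 / map_distance_mm


accuracy_25k = ground_distance_m(0.5, 25_000)    # 12.5 m on the ground at 1:25,000
denominator_6m = implied_scale_denominator(6.0)  # 12000.0, i.e. 1:12,000 for 6 m accuracy
```

The second function is the convention by which a purely digital database is sometimes assigned a representative fraction from its positional accuracy alone.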
Features can be represented as points or lines, so the physical limitations to the minimum sizes of symbols that are characteristic of paper maps no longer apply. For example, a database may contain some features associated with 1:25,000 map specifications, but not all; and may include representations of features smaller than 12.5 m on the ground. Positional accuracy is also no longer necessarily tied to representative fraction, since points can be located to any precision, up to the limits imposed by internal representations of numbers (e.g., single precision is limited to roughly 7 significant digits, double precision to 15). Thus the three properties that were conveniently summarized by representative fraction - positional accuracy, spatial resolution, and feature content - are now potentially independent.

Unfortunately this has led to a complex system of conventions in an effort to preserve representative fraction as a universal defining characteristic of digital databases. When such databases are created directly from paper maps, by digitizing or scanning, it is possible for all three properties to remain correlated. But in other cases the representative fraction cited for a digital database is the one implied by its positional accuracy (e.g., a database has representative fraction 1:12,000 because its positional accuracy is 6 m); and in other cases it is the feature content or spatial resolution that defines the conventional representative fraction (e.g., a database has representative fraction 1:12,000 because features at least 6 m across are included). Moreover, these conventions are typically not understood by novice users - the general public, or children - who may consequently be very confused by the use of a fraction to characterize spatial data, despite its familiarity to specialists.

SPATIAL EXTENT

The term scale is often used to refer to the extent or scope of a study or project, and spatial extent is an obvious metric.
It can be defined in area measure, but for the purposes of this discussion a length measure is preferred, and the symbol L will be used. For a square project area it can be set to the width of the area, but for rectangular or oddly shaped project areas the square root of area provides a convenient metric. Spatial extent defines the total amount of information relevant to a project, which rises with the square of a length measure.

PROCESS SCALE

The term process refers here to a computational model or representation of a landscape-modifying process, such as erosion or runoff. From a computational perspective, a process is a transformation that takes a landscape from its existing state to some new state, and in this sense processes are a subset of the entire range of transformations that can be applied to spatial data.

Define a process as a mapping $b(x, t_2) = f(a(x, t_1))$ where $a$ is a vector of input fields, $b$ is a vector of output fields, $f$ is a function, $t$ is time, $t_2$ is later in time than $t_1$, and $x$ denotes location. Processes vary according to how they modify the spatial characteristics of their inputs, and these are best expressed in terms of contributions to the spatial spectrum. For example, some processes determine $b(x, t_2)$ based only on the inputs at the same location $a(x, t_1)$, and thus have minimal effect on spatial spectra. Other processes produce outputs that are smoother than their inputs, through processes of averaging or convolution, and thus act as low-pass filters. Less commonly, processes produce outputs that are more rugged than their inputs, by sharpening rather than smoothing gradients, and thus act as high-pass filters.

The scale of a process can be defined by examining the effects of spectral components on outputs. If some wavelength s exists such that components with wavelengths shorter than s have negligible influence on outputs, then the process is said to have a scale of s.
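A process that acts as a low-pass filter can be illustrated with a minimal Python sketch. The example assumes a one-dimensional field sampled at equal spacing and uses a simple moving average as the function $f$; the names `moving_average` and `adjacent_mean_square_difference` are hypothetical, and the second function is only a rough stand-in for the short-distance end of a variogram.

```python
def moving_average(values, window=3):
    """A smoothing process: each output is the mean of the inputs in a local window."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        neighbours = values[lo:hi]          # window is truncated at the edges
        smoothed.append(sum(neighbours) / len(neighbours))
    return smoothed


def adjacent_mean_square_difference(values):
    """Mean squared difference between neighbouring cells (short-distance variability)."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return sum(d * d for d in diffs) / len(diffs)


a = [0.0, 10.0, 0.0, 10.0, 0.0, 10.0, 0.0, 10.0]   # input field dominated by short wavelengths
b = moving_average(a)                               # output field after the process

before = adjacent_mean_square_difference(a)
after = adjacent_mean_square_difference(b)          # much smaller than `before`
```

The input alternates at the shortest wavelength the sampling can carry, so the averaging removes most of its short-distance variability, which is the behaviour described for smoothing processes.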
It follows that if s is less than the spatial resolution S of the input data, the process will not be accurately modeled.

While these conclusions have been expressed in terms of spectra, it is also possible to interpret them in terms of variograms and correlograms. A low-pass filter reduces variance over short distances, relative to variance over long distances. Thus the short-distance part of the variogram is lowered, and the short-distance part of the correlogram is increased. Similarly a high-pass filter increases variance over short distances relative to variance over long distances.

L/S RATIO

While scaling ratios make sense for analog representations, the representative fraction is clearly problematic for digital representations. But spatial resolution and spatial extent both appear to be meaningful in both analog and digital contexts, despite the problems with spatial resolution for vector data. Both S and L have dimensions of length, so their ratio is dimensionless. Dimensionless ratios often play a fundamental role in science (e.g., the Reynolds number in hydrodynamics), so it is possible that L/S might play a fundamental role in geographic information science. In this section I examine some instances of the L/S ratio, and possible interpretations that provide support for this speculation.

- Today’s computing industry seems to have settled on a screen standard of order 1 megapel, or 1 million picture elements. The first PCs had much coarser resolutions (e.g., the CGA standard of the early 1980s), but improvements in display technology led to a series of more and more detailed standards. Today, however, there is little evidence of pressure to improve resolution further, and the industry seems to be content with an L/S ratio of order $10^3$.
Similar ratios characterize the current digital camera industry, although professional systems can be found with ratios as high as 4,000.

- Remote sensing instruments use a range of spatial resolutions, from the 1 m of IKONOS to the 1 km of AVHRR. Because a complete coverage of the Earth’s surface at 1 m requires on the order of $10^{15}$ pixels, data are commonly handled in more manageable tiles, or approximately rectangular arrays of cells. For years, Landsat TM imagery has been tiled in arrays of approximately 3,000 cells x 3,000 cells, for an L/S ratio of 3,000.

- The value of S for a paper map is determined by the technology of map-making, and techniques of symbolization, and a value of 0.5 mm is not atypical. A map sheet 1 m across thus achieves an L/S ratio of 2,000.

- Finally, the human eye’s S can be defined as the size of a retinal cell, and the typical eye has order $10^8$ retinal cells, implying an L/S ratio of 10,000. Interestingly, then, the screen resolution that users find generally satisfactory corresponds approximately to the parameters of the human visual system; it is somewhat larger, but the computer screen typically fills only a part of the visual field.

These examples suggest that L/S ratios of between $10^3$ and $10^4$ are found across a wide range of technologies and settings, including the human eye. Two alternative explanations immediately suggest themselves: the narrow range may be the result of technological and economic constraints, and thus may expand as technology advances and becomes cheaper; or it may be due to cognitive constraints, and thus is likely to persist despite technological change.

This tension between technological, economic, and cognitive constraints is well illustrated by the case of paper maps, which evolved under what from today’s perspective were severe technological and economic constraints. For example, there are limits to the stability of paper and to the kinds of markings that can be made by hand-held pens.
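The L/S values quoted in the list of examples can be reproduced with a few lines of Python. The sketch assumes, as the text does implicitly, that a square array of n cells has L/S equal to the square root of n; the function and dictionary names are hypothetical.

```python
import math


def ls_from_cell_count(cells: float) -> float:
    """L/S for a square array: the number of cells along one side."""
    return math.sqrt(cells)


def ls_from_lengths(extent: float, resolution: float) -> float:
    """L/S from an extent and a resolution given in the same length unit."""
    return extent / resolution


ls_examples = {
    "megapel screen": ls_from_cell_count(1e6),              # 1,000
    "Landsat TM tile": ls_from_cell_count(3000 * 3000),     # 3,000
    "paper map, 1 m at 0.5 mm": ls_from_lengths(1000.0, 0.5),  # 2,000 (lengths in mm)
    "human eye": ls_from_cell_count(1e8),                   # 10,000
}

log_ls = {name: math.log10(value) for name, value in ls_examples.items()}
```

All four values of `log_ls` fall between 3 and 4, which is the narrow range the argument relies on.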
The costs of printing drop dramatically with the number of copies printed, because of strong economies of scale in the printing process, so maps must satisfy many users to be economically feasible. Goodchild [2000] has elaborated on these arguments. At the same time, maps serve cognitive purposes, and must be designed to convey information as effectively as possible. Any aspect of map design and production can thus be given two alternative interpretations: one, that it results from technological and economic constraints, and the other, that it results from the satisfaction of cognitive objectives. If the former is true, then changes in technology may lead to changes in design and production; but if the latter is true, changes in technology may have no impact.

The persistent narrow range of L/S from paper maps to digital databases to the human eye suggests an interesting speculation: that cognitive, not technological or economic, objectives confine L/S to this range. From this perspective, L/S ratios of more than $10^4$ have no additional cognitive value, while L/S ratios of less than $10^3$ are perceived as too coarse for most purposes. If this speculation is true, it leads to some useful and general conclusions about the design of geographic information handling systems. In the next section I illustrate this by examining the concept of Digital Earth. For simplicity, the discussion centers on the log to base 10 of the L/S ratio, denoted by log L/S, and the speculation that its effective range is between 3 and 4.

This speculation also suggests a simple explanation for the fact that scale is used to refer both to L and to S in environmental science, without hopelessly confusing the listener. At first sight it seems counterintuitive that the same term should be used for two independent properties.
But if the value of log L/S is effectively fixed, then spatial resolution and extent are strongly correlated: a coarse spatial resolution implies a large extent, and a detailed spatial resolution implies a small extent. If so, then the same term is able to satisfy both needs.

THE VISION OF DIGITAL EARTH

The term Digital Earth was coined in 1992 by U.S. Vice President Al Gore [Gore, 1992], but it was in a speech written for delivery in 1998 that Gore fully elaborated the concept (www.d~~Pl9980131 .html): “Imagine, for example, a young child going to a Digital Earth exhibit at a local museum. After donning a head-mounted display, she sees Earth as it appears from space. Using a data glove, she zooms in, using higher and higher levels of resolution, to see continents, then regions, countries, cities, and finally individual houses, trees, and other natural and man-made objects. Having found an area of the planet she is interested in exploring, she takes the equivalent of a ‘magic carpet ride’ through a 3-D visualization of the terrain.”

This vision of Digital Earth (DE) is a sophisticated graphics system, linked to a comprehensive database containing representations of many classes of phenomena. It implies specialized hardware in the form of an immersive environment (a head-mounted display), with software capable of rendering the Earth’s surface at high speed, and from any perspective. Its spatial resolution ranges down to 1 m or finer. On the face of it, then, the vision suggests data requirements and bandwidths that are well beyond today’s capabilities. If each pixel of a 1 m resolution representation of the Earth’s surface was allocated an average of 1 byte then a total of 1 Pb of storage would be required; storage of multiple themes could push this total much higher. In order to zoom smoothly down to 1 m it would be necessary to store the data in a consistent data structure that could be accessed at many levels of resolution.
Many data types are not obviously renderable (e.g., health, demographic, and economic data), suggesting a need for extensive research on visual representation.

The bandwidth requirements of the vision are perhaps the most daunting problem. To send 1 Pb of data at 1 Mb per second would take roughly a human lifetime, and over 12,000 years at 56 Kbps. Such requirements dwarf those of speech and even full-motion video. But these calculations assume that the DE user would want to see the entire Earth at 1 m resolution. The previous analysis of log L/S suggested that for cognitive (and possibly technological and economic) reasons user requirements rarely stray outside the range of 3 to 4, whereas a full Earth at 1 m resolution implies a log L/S of approximately 7. A log L/S of 3 suggests that a user interested in the entire Earth would be satisfied with 10 km resolution; a user interested in California might expect 1 km resolution; and a user interested in Santa Barbara County might expect 100 m resolution. Moreover, these resolutions need apply only to the center of the current field of view.

On this basis the bandwidth requirements of DE become much more manageable. Assuming an average of 1 byte per pixel, a megapel image requires order $10^7$ bps if refreshed once per second. Every one-unit reduction in log L/S results in two orders of magnitude reduction in bandwidth requirements. Thus a T1 connection seems sufficient to support DE, based on reasonable expectations about compression, and reasonable refresh rates. On this basis DE appears to be feasible with today’s communication technology.

CONCLUDING COMMENTS

I have argued that scale has many meanings, only some of which are well defined for digital data, and therefore useful in the digital world in which we increasingly find ourselves.
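The Digital Earth resolution and bandwidth estimates can be reproduced with a short Python sketch. The extents used below (10,000 km for the Earth, 1,000 km for California, 100 km for Santa Barbara County) are round figures assumed here because they are what the quoted resolutions imply at log L/S = 3; they are not stated in the text. The function names are hypothetical.

```python
def resolution_m(extent_m: float, log_ls: float = 3.0) -> float:
    """Spatial resolution S implied by an extent L and a fixed log L/S."""
    return extent_m / 10 ** log_ls


def bandwidth_bps(log_ls: float, bytes_per_pixel: float = 1.0,
                  refresh_hz: float = 1.0) -> float:
    """Uncompressed bits per second for a square image with (L/S)^2 pixels."""
    pixels = (10 ** log_ls) ** 2
    return pixels * bytes_per_pixel * 8 * refresh_hz


# Assumed round-number extents in metres (illustrative only).
extents_m = {"Earth": 1e7, "California": 1e6, "Santa Barbara County": 1e5}
resolutions = {name: resolution_m(extent) for name, extent in extents_m.items()}

megapel = bandwidth_bps(3.0)                          # 8e6 bps, i.e. of order 10^7
ratio = bandwidth_bps(4.0) / bandwidth_bps(3.0)       # two orders of magnitude per unit of log L/S
```

Under these assumptions the three resolutions come out as 10 km, 1 km and 100 m, and each unit of log L/S changes the pixel count, and hence the uncompressed bandwidth, by a factor of 100.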
The practice of establishing conventions which allow the measures of an earlier technology - the paper map - to survive in the digital world is appropriate for specialists, but is likely to make it impossible for non-specialists to identify their needs. Instead, I suggest that two measures, identified here as the large measure L and the small measure S, be used to characterize the scale properties of geographic data.

The vector-based representations do not suggest simple bases for defining S, because their spatial resolutions are either variable or arbitrary. On the other hand, spatial variation in S makes good sense in many situations. In social applications, it appears that the processes that characterize human behavior are capable of operating at different scales, depending on whether people act in the intensive pedestrian-oriented spaces of the inner city or the extensive car-oriented spaces of the suburbs. In environmental applications, variation in apparent spatial resolution may be a logical sampling response to a phenomenon that is known to have more rapid variation in some areas than others; from a geostatistical perspective this might suggest a non-stationary variogram or correlogram (for examples of non-stationary geostatistical analysis see Atkinson [2001]). This may be one factor in the spatial distribution of weather observation networks (though others, such as uneven accessibility and uneven need for information, are also clearly important).

The primary purpose of this paper has been to offer a speculation on the significance of the dimensionless ratio L/S. The ratio is the major determinant of data volume, and consequently processing speed, in digital systems. It also has cognitive significance because it can be defined for the human visual system. I suggest that there are few reasons in practice why log L/S should fall outside the range 3 to 4, and that this provides an important basis for designing systems for handling geographic data.
Digital Earth was introduced as one such system. A constrained ratio also implies that L and S are strongly correlated in practice, as suggested by the common use of the same term scale to refer to both.

ACKNOWLEDGMENT

The Alexandria Digital Library and its Alexandria Digital Earth Prototype, the source of much of the inspiration for this paper, are supported by the U.S. National Science Foundation.

REFERENCES

Atkinson, P.M., 2001. Geographical information science: Geocomputation and nonstationarity. Progress in Physical Geography 25(1): 111-122.
Goodchild, M.F., 2000. Communicating geographic information in a digital age. Annals of the Association of American Geographers 90(2): 344-355.
Goodchild, M.F. & J. Proctor, 1997. Scale in a digital geographic world. Geographical and Environmental Modelling 1(1): 5-23.
Gore, A., 1992. Earth in the Balance: Ecology and the Human Spirit. Houghton Mifflin, Boston, 407 pp.
Lam, N.-S. & D. Quattrochi, 1992. On the issues of scale, resolution, and fractal analysis in the mapping sciences. Professional Geographer 44(1): 88-98.
Quattrochi, D.A. & M.F. Goodchild (Eds), 1997. Scale in Remote Sensing and GIS. Lewis Publishers, Boca Raton, 406 pp.

Chinese translation: Metrics of scale in remote sensing and GIS
Michael F. Goodchild (National Center for Geographic Information and Analysis, Department of Geography, University of California, Santa Barbara)
ABSTRACT: The term scale has many meanings, some of which survive the transition from analog to digital representations of information better than others.