Plasmon-Enhanced Terahertz Photodetection in Graphene

Xinghan Cai,† Andrei B. Sushkov,† Mohammad M. Jadidi,‡ Luke O. Nyakiti,⊥ Rachael L. Myers-Ward,¶ D. Kurt Gaskill,¶ Thomas E. Murphy,‡ Michael S. Fuhrer,†,§ and H. Dennis Drew*,†

†Center for Nanophysics and Advanced Materials, University of Maryland, College Park, Maryland 20742-4111, United States
‡Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, Maryland 20742, United States
§School of Physics, Monash University, Victoria 3800, Australia
⊥Texas A&M University, Galveston, Texas 77553, United States
¶U.S. Naval Research Laboratory, Washington, D.C. 20375, United States
*Supporting Information

... result in a variety of potential photonic applications such as optical modulators,4 plasmonic devices,5−8 and THz emitters.9 Particularly promising is terahertz (THz) photodetection, in which graphene devices may offer significant advantages over existing technology in terms of speed and sensitivity.10−12 Because of graphene's small electronic heat capacity and relatively large electron−electron relaxation rate compared to its electron−phonon relaxation rate,13,14 hot-electron effects are important in graphene even at room temperature and have been exploited to realize fast, sensitive THz detection via the photothermoelectric effect10,15,16 and bolometric effect.17,18 However, a significant challenge remains in increasing graphene's absorption. Graphene's interband absorption is a frequency-independent constant πα ≈ 2.3%, where α is the fine-structure constant.19,20 Owing to its zero-band-gap nature, doped graphene shows a relatively high DC conductivity, which results in a considerable Drude absorption (free-carrier response) in the THz range.21,22 However, the Drude absorption in graphene is strongly frequency dependent, decreasing as (ωτ)⁻² at high frequencies ω ≫ 1/τ, where τ is the scattering time, proportional to graphene's mobility and typically 10−100 fs in graphene. Thus, the Drude absorption rolls off at lower frequencies in higher-mobility (higher-τ) graphene samples. A number of efforts have been made to increase the absorption in graphene photodetectors. Quantum dots deposited on graphene can enhance the light−matter interaction;23 however, the approach is likely limited to the visible or near-infrared, where the interband absorption of the quantum dot lies, and the response times are slow. Locating the detector in a microcavity, which resonates at a selected frequency, can enhance absorption, but to date this has been demonstrated only at near-infrared wavelengths24 and would be cumbersome for long-wavelength THz radiation. Coupling the detector to an antenna is a viable approach for frequencies up to the low THz, but there are few demonstrations of antenna-coupled graphene devices,25 and the approach is applicable only to devices whose size is much smaller than the wavelength. In contrast to these approaches, plasmon resonances in finite-width graphene can provide a strong absorption that has a fast response (set by the thermal relaxation time10), is tunable over a broad range of frequencies in the THz through changing either the confinement size or the carrier density,26,27 and is more amenable to fabrication of arrays for large-area detectors compared to antenna-coupled devices. Here we demonstrate a room-temperature THz detector based on large-area arrays of epitaxial graphene microribbons contacted by metal electrodes, whose responsivity is significantly improved by the plasmon-enhanced absorption. We show that if the opposing edges of the microribbons are
directly contacted by metal electrodes, the altered boundary conditions at the graphene−metal interface and associated currents in the metal28,29 make it difficult to directly excite plasmon resonances. In contrast, if the ribbons are oriented perpendicular to the metal electrodes, then the subwavelength metal electrode pattern reflects the incident wave with the necessary polarization perpendicular to the ribbons and parallel to the electrodes, greatly reducing the plasmonic excitation. We therefore adopt a novel geometry of graphene microribbons tilted at an angle with respect to the electrode array, in which the plasmon mode associated with currents transverse to the ribbon can be efficiently excited by light polarized perpendicular to the metal electrodes. By using dissimilar metal electrodes, we form a photothermoelectric detector from our tilted graphene microribbon array. We observe an enhanced photovoltage at room temperature when the carrier density of graphene is tuned such that the plasmon resonance frequency matches the THz continuous-wave excitation. The frequency- and polarization-angle-dependent absorption and the gate-voltage- and polarization-angle-dependent photoresponse are well described by a simple plasmonic conductivity model for graphene.

Plasmon resonances in graphene have been previously studied in exfoliated graphene samples by using infrared nanoimaging6,30 and by Fourier transform infrared spectroscopy (FTIR) in arrays of microribbons or disks patterned from large-area chemical vapor deposition-grown graphene.7,31 The plasmon dispersion relation for graphene7,26,27 is given by

q = (ε1 + ε2) ℏ ω(ω + i/τ) / [4e² vF (πn)^(1/2)]    (1)

where ε1,2 is the dielectric constant of the media above/below graphene, n is the charge carrier density in graphene, vF = 10⁶ m/s is graphene's Fermi velocity, ℏ is the reduced Planck constant, and e is the elementary charge. We expect that a graphene ribbon of width W will determine the plasmon wavevector q such that

q = (Nπ − δ)/W    (2)

where N is the harmonic order of the plasmonic mode and δ is a phase shift upon reflection at the graphene edge. Numerical results indicate that δ = π/4 for termination by a dielectric.32,33 Then we have for the plasmon resonance frequency

ωp = [3π^(3/2) e² vF n^(1/2) / (ℏ(ε1 + ε2) W)]^(1/2)    (3)

For graphene on SiC (ε1 ∼ 9.6) with a PEO electrolyte top gate (ε2 ∼ 3), the plasmon frequency is fp = ωp/2π = 2.73 THz × [n (10¹² cm⁻²)]^(1/4) × [W (μm)]^(−1/2). Here we show the first observation of such standing-wave plasmons in monolayer epitaxial graphene on SiC(0001) substrates. We patterned large-area graphene on a SiC substrate into a microribbon array using standard electron beam lithography (see Methods).
Figure 1. Attenuation spectra for (a−c) a graphene ribbon array with no metal electrodes, (d−f) a graphene ribbon array oriented orthogonal to a metal electrode grating, and (g,h) a graphene ribbon array tilted 45° with respect to a metal electrode grating. Optical micrographs of the devices are shown in panels a, d, and g. The insets show the corresponding schematics, respectively. Attenuation spectra at different gate voltages Vg are shown in panels b, c, e, f, and h. In panels c, e, f, and h, spectra are normalized by the spectrum at Vg = Vg,min, and in panel b at Vg = Vg,min + 2.2 V. The incident electric field is polarized parallel to the graphene ribbons in panels b and e and perpendicular to the graphene ribbons in panels c and f. In panel h, the incident electric field polarization is at 45° to the graphene ribbons and perpendicular to the metal grating. (i) Plasmonic resonance frequency fp as a function of carrier density n for the device shown in panels a−c. Black points are extracted from fits of the data in panel c as described in the text; fits to the data in panel c are shown as solid lines in the inset. Red line: fit to eq 3 in the text.

Figure 1, panel a shows the optical micrograph of the sample with patterned electron-beam resist on top, which is used as a mask to etch the graphene underneath (the graphene on SiC is not easily visible in optical microscopy). The total array size is 2 mm × 2 mm, the ribbon width is 2.3 μm, and the period of the array is 4.6 μm. The response of the device to THz excitation is characterized by FTIR (see Methods). The attenuation spectra with the excitation polarized parallel and perpendicular to the ribbon are plotted in Figure 1, panels b and c, respectively. In our experiment, the attenuation A is defined as A = 1 − T(Vg)/T(Vg,min) = ΔT/T(Vg,min), where T(Vg) is the transmission when the applied gate voltage is Vg, and T(Vg,min) is the transmission at the charge-neutral point. The carrier density in graphene is tuned by applying the gate voltage Vg through an electrolyte top gate [LiClO4 + PEO (poly(ethylene oxide))]. Note that the spectra are normalized to the transmission at Vg,min in Figure 1, panel c and to the transmission at Vg = Vg,min + 2.2 V in Figure 1, panel b. Here, we take the spectrum corresponding to the lowest carrier density of graphene achieved in each data set as the reference spectrum for normalization. As shown in Figure 1, panel b, a Drude response is observed, where the attenuation decreases monotonically with frequency. A completely different line shape is seen for the attenuation spectra in Figure 1, panel c, when the incident light is polarized perpendicular to the ribbons, where we see enhanced absorption associated with excitation of the intrinsic plasmon. In this device, where the ribbon width is fixed, a blue shift of fp is observed when increasing n by raising the gate voltage. We modeled the spectra shown in Figure 1, panel c using a simple plasmonic conductivity model, with fp and n as fit parameters and assuming a constant μ = 1300 cm² V⁻¹ s⁻¹ (see Methods). We then plot the modeled fp versus n with a fit to eq 3, which gives fp = 1.92 THz × [n (10¹² cm⁻²)]^(1/4). The prefactor 1.92 is very close to the expected value of 1.80 found from eq 3 with W = 2.3 μm. The inset of Figure 1, panel i shows the individual fits to selected curves from Figure 1, panel c.

To be used as a photodetector, graphene elements need to be connected via a conductive material to form a closed electrical circuit. Additionally, we expect that detectors exploiting hot-electron effects will require electrode spacings comparable to the diffusion length of electrons due to electron−phonon scattering, expected to be less than 1 μm, far smaller than the THz wavelength in free space (∼100 μm).34 Figure 1, panel d shows a graphene microribbon array, similar to that in Figure 1, panel a, that is contacted by a perpendicular array of metal electrodes. The vertical graphene ribbons, faintly visible in Figure 1, panel d, are 0.6 μm wide with a period of 2 μm. The horizontal chromium/gold (4 nm/45 nm) electrodes were patterned on top of the graphene ribbons with an electrode width of 1.7 μm and a period of 9 μm.
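To make the numbers above easy to check, here is a small numerical sketch (ours, not part of the paper) that evaluates eq 3 in CGS units; the function name and constant values are assumptions. With n = 10¹² cm⁻² it reproduces the 2.73 THz prefactor quoted after eq 3, and with W = 2.3 μm it gives the ≈1.8 THz value used to judge the fit.

```python
import numpy as np

# Minimal sketch: evaluate eq 3 for the fundamental (N = 1) ribbon plasmon, CGS units.
e2 = 2.307e-19      # elementary charge squared, erg*cm (esu^2)
hbar = 1.055e-27    # reduced Planck constant, erg*s
v_F = 1.0e8         # graphene Fermi velocity, cm/s

def plasmon_freq_THz(n_cm2, W_cm, eps1=9.6, eps2=3.0):
    """Plasmon resonance frequency f_p of a graphene ribbon (eq 3), in THz."""
    omega_p = np.sqrt(3.0 * np.pi**1.5 * e2 * v_F * np.sqrt(n_cm2)
                      / (hbar * (eps1 + eps2) * W_cm))
    return omega_p / (2.0 * np.pi) / 1e12

print(plasmon_freq_THz(1e12, 1.0e-4))   # W = 1 um   -> ~2.7 THz (the quoted prefactor)
print(plasmon_freq_THz(1e12, 2.3e-4))   # W = 2.3 um -> ~1.8 THz (device of Figure 1a-c)
```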
Figure1,panels e and f show the measured attenuation spectra for two polarization cases.When the incident THz signal is polarized parallel to the microribbons,a Drude-like response is shown in Figure1,panel e similar to Figure1,panel b.For polarization perpendicular to the ribbons,a plasmon resonance is observed in Figure1,panel f similar to Figure1,panel c. Because the ribbons are about four-times narrower,the resonant frequency is higher by a factor of about two. Additionally,by comparing Figure1,panels c and f,wefind that the strength of the plasmon resonance is reduced in the metal-contacted case and is smaller than the strength of the resonance for the uncontacted ribbons.This is a consequence of the subwavelength metal grating that is a good reflector for radiation polarized parallel to the grating wires.The extinction coefficient of metal wire gratings scales in proportion to(d/λ)2 at long wavelengths.This is a significant disadvantage of this scheme,since we expect that detectors will require even smaller electrode spacings on the micron scale is limited by the diffusion length.To overcome the difficulties above,we adopt a design with graphene ribbons tilted at an angle with respect to the metal grating,as shown in Figure1,panel g.In this device,the period of the graphene ribbon array is2μm,and the ribbon width is 0.6μm,similar to the device in Figure1,panels d−f.Bimetal electrodes(20nm chromium+25nm gold)are deposited on graphene ribbons using a two-step shadow evaporation technique(see Methods).The graphene ribbons were inclined at an angle ofθ=45°with respect to the metal contacts and have a length of5.7μm,which is less than the previous device but still reasonably long,to allow some transmission of both polarizations as will be discussed in the next section.Light polarized perpendicular to the metal grid(which does not suffer from the polarizer effect)now has an electricfield component perpendicular to the graphene ribbon axis and can therefore excite the transverse plasmon resonance.In this case,when the incident THz radiation is polarized perpendicular to the metallic grating,we can see evidence of gate-tunable plasmonic absorption in the attenuation spectrum,as shown in Figure1, panel h.This is in contrast to Figure1,panel e,where no plasmonic resonance can be seen for light polarized perpendicular to the metal electrode grating.We further explore the polarization dependence of the tilted-ribbon array.Figure2,panel a shows a color map of the polarization-dependent attenuation of the tilted ribbon array as described in Figure1,panels g−h at V g=V g,min+5.4V,whichFigure2.(a)Attenuation at V g=V g,min+5.4V as a function of the frequency(radial axis)and the incident polarization(azimuthal axis). Inset:A scanning electron micrograph of a similar device(left)and the schematic of the device with the defined polarized angleθof the incident light(right).The graphene ribbons are tilted45°to the metal electrodes.(b,c)Simulated charge density profile in the graphene−metal microstructure at the plasmon resonance frequency.The polarization of the incident plane-wave(7.4THz)is perpendicular to the graphene ribbons in panel b and parallel in panel c, corresponding to the points marked with black and white∗symbols in panel a,respectively.The same color scale is used for both panels.is the highest gate voltage(highest carrier density)we achieved. The color scale indicates the normalized attenuation. 
Considering the metal polarizer effect,the attenuation here is defined as A=(1−T high/T low)×f(ω,θ),where T high is the transmission at V g=V g,min+5.4V,T low is the transmission at V g =V g,min,and f(ω,θ)is the experimentally determined extinction factor of the metal grating(see Methods for detailed information).Here,the attenuation is plotted as a function of frequency(plotted along the radial direction)and polarization angle,as defined in the inset schematic.The left inset of Figure 2,panel a shows an scanning electron microscopy(SEM)image of a similar device fabricated in the same way.Because the attenuation is multiplied by f(ω,θ),the effect of the metal grating is included,and the polarization dependence is due to both the attenuation caused by graphene and metal grid. Additionally,the metal grid is symmetric with respect to polarizations at positive and negative angles±θ,so asymmetry for±θis caused by the tilting of graphene with respect to the metal grid.Indeed,we observe a highly asymmetric pattern of attenuation.When the angle of polarization is inclined in the direction parallel to the graphene ribbons(θ>0),we observe a Drude-like absorption spectrum,which decreases monotoni-cally with frequency.By contrast,when the angle of polarization is inclined in the direction perpendicular to the ribbons(θ<0), we observe a peak in attenuation at∼7.4THz,which we identify as the plasmon resonance frequency for these ribbons at this gate voltage.Figure2,panels b and c show the simulated charge density oscillations in our device structure at this frequency for two polarization anglesθ=±45°(perpendicular and parallel to the ribbons,marked with black and white∗in Figure2a),respectively(see Methods and the Supplementary Movie of the Supporting Information for detailed information). Compared to Figure2,panel c,which shows a very weak chargedensity oscillation,Figure2,panel b clearly displays a charge density wave excited by the incident electricfield polarized perpendicular to the ribbons,which supports the identification of the observed attenuation peak at7.4THz andθ<0as the transverse plasmon in our graphene−metal microstructure. 
We next discuss a similar device but with a smaller electrode spacing more compatible with enhanced photothermoelectric detection.The device is fabricated using the same technique as the device shown in Figure2,but here the graphene ribbon width is1.1μm,and the interelectrode spacing is3.8μm,which is closer to the estimated graphene hot carrier diffusion length to enhance the hot electron photothermoelectric effect and thus improve the detection efficiency.Ideally,an even shorter spacing could be adopted to make the device more dominated by diffusive cooling and put more light sensitive elements in series to enhance the photovoltage signal.The two-step shadow evaporation technique for asymmetric metal electrodes deposition is used so that each graphene channel(light sensitive part of the detector)has asymmetrical contacts(gold contact on the bottom edge and chromium contact on the top edge),which helps to generate a net photothermoelectric signal when the device is uniformly illuminated(see Methods).Figure 3shows the attenuation spectra at different gate voltages for the incident light polarized with three typical angles.Atθ=60°(Figure3a),because of the polarizing effect of the metal grid, which reduces the parallel component of the electricfield,the effective electricfield interacting with graphene is nearly parallel to the ribbons,which results in a dominant Drude response.At θ=−60°(Figure3c),the effective electricfield is close to perpendicular to the graphene ribbons,which excites the transverse plasmons in the graphene ribbon,leading to increased attenuation at the plasmon resonant frequency, which is in the range4−6THz.As expected,the plasmon frequency increases with charge carrier density,which is varied by applying a gate voltage.Interestingly,atθ=0°(Figure3b), the angle at which the incident light is minimally absorbed by the metal grid,a combined response is observed,especially at high gate voltage.Here the components of the electricfield parallel and perpendicular to graphene ribbons are nearly equal. 
At the highest gate voltage(magenta curve),the attenuation shows a local plasmonic peak at f≈5.3THz and also a Drude response at low frequency.Now we study the frequency and the polarization angle dependence of the attenuation at large positive gate voltage in more detail.Figure4,panel a shows the attenuation of the same device studied in Figure3at V g=V g,min+6.5V,the highest gate voltage(carrier density)achieved.Similar to Figure2, panel a,the color scale indicates the normalized attenuation.As shown in Figure4,panel a,the attenuation peaks nearθ=0°because the metal grating reflects a large portion of the incident light polarized in other directions owing to the small spacing between metal electrodes.There is a local maximum at the frequency of∼5.3THz corresponding to plasmon-enhanced attenuation,which is clearly separated from the Drude response at f<3THz.The plasmon peak is asymmetric in polarization angle with more weight at negative angle,while the Drude response occurs at positive angle.To understand the relationship between plasmonic excitation and polarization,we developed a simple plasmon conductivity model to predict the expected absorption in thegraphene Figure3.Attenuation at different V g normalized by the spectrum at V g,min as a function of frequency for a device with graphene ribbon width of1.1μm and interelectrode spacing of3.8μm.(a)The incident polarization angleθ=60°,corresponding to a Drude response,(b)θ= 0°,corresponding to a combined Drude and plasmon response,and (c)θ=−60°,corresponding to a plasmon response.The insets show schematics of the device and the polarization of the incident light for each measurement,respectively.ribbons (see Methods).The modeled attenuation is plotted in Figure 4,panel b in the same way as the experimental data shown in Figure 4,panel a.The only free parameters of the model are the carrier density n =1.6×1013cm −2and the mobility of graphene μ=800cm 2V −1s −1,which determines τ=37fs.According to the model,the resistivity of the device at this gate voltage is ∼500Ω,which is lower than the measured resistivity of 1.4K Ω.We attribute this di fference to the contact resistance contribution in the two-probe transport measure-ment across multiple graphene/metal junctions.The model reproduces the features of the experimental data.A stronger attenuation peak at finite frequency is both predicted and observed when the angle of polarization is inclined toward the direction perpendicular to the graphene ribbons,which signi fies the excitation of a transverse plasmonic resonance.Next we discuss the electrical response to THz radiation of the same device as in Figures 3and 4.Photoresponse measurements were performed using a continuous wave THz laser at 5.3THz as the source (see Methods).Figure 5,panel a shows the photovoltage as a function of the applied top gate voltage (radial axis,measured relative to the charge neutral point)and the polarization angle of the CW excitation(azimuth).As shown previously,10the photovoltage is generated by the photothermoelectric e ffect 35−37in graphene due to asymmetry of the electrodes.As reported in ref 10,this type of asymmetry leads to photothermoelectric voltage that is peaked near the Dirac point and monotonically decreases with the carrier density.Figure 5,panel b shows the modeled photoresponse as a function of gate voltage and polarization angle using the same parameters as in Figure 4,panel b,and a photothermoelectric model 10with asymmetry generated by both an extra contact resistance R c =35Ωat the gold 
electrode and the di fference of the work function between chromium and gold (see Methods).Both the experimental and modeled signals show maxima at small gate voltages where the photothermoelectric responsivity peaks.38−40In addition,when the gate voltage is low,the photovoltage is symmetric around θ=0°as the plasmon is only weakly excited in the low doped region.The signal for this device with a small metal spacing depends primarily on the polarizer e ffect of the metal electrodes and thus peaks with angle near θ=0°.At larger gate voltages,the photoresponse increases with increasing gate voltage.This rise is not due to increased responsivity,as observed earlier,10and explained within the asymmetricmetalFigure 4.(a)Experimental attenuation at V g =V g,min +6.5V as a function of frequency (radial axis)and the incident polarization (azimuthal axis)for the same device of Figure 3.(b)Simulated attenuation of the device shown in panel a using the model discussed in the text.The insets show schematics of the devices and de fine the polarization angle θ.Figure 5.(a)Measured magnitude of the photovoltage for a tilted graphene ribbon array photodetector as a function of V g (radial axis)and the incident polarization (azimuthal axis).The device is the same as in Figure 4,panel a,and the frequency of the laser excitation is 5.3THz (175cm −1).(b)Simulated photoresponse of the same device using the model discussed in the text.The insets show schematics of the devices and de fine the polarization angle θ.electrodes model the responsivity decreases monotonically with increasing gate voltage at high gate voltage.Instead,the increase is explained by enhanced absorption in the device, which is due to(1)increase in DC conductivity with increased gate voltage and(2)resonant plasmonic absorption.The shift of the peak in photoresponse with respect to angle toθ<0°clearly indicates that the plasmonic effect is dominant in increasing the absorption,similar to Figure4,panels a and b. 
To summarize,we have demonstrated a scheme for efficient THz excitation of resonant plasmons in graphene mircoribbon arrays contacted by metal electrodes with spacing much smaller than the free space wavelength.Resonant plasmon absorption enhances the absorption of radiation by graphene and therefore increases the external efficiency of graphene photothermo-electric detectors.Additionally the plasmon resonance is tunable through both geometry(ribbon width)and carrier density,enabling spectral resolution and tunability in graphene photothermoelectric detectors.In the device demonstrated here,the spectral resolution quality factor Q=ωpτ=1.2,is imited by the fairly low mobility of epitaxial graphene.Hence, for the present device,the THz attenuation is comparable in magnitude for the Drude and plasmonic absorption,as seen in, for example,Figure3.However,our scheme has significant advantages if the mobility of the graphene can be increased, increasing scattering timeτ,which determines the width of both the Drude response and plasmon resonance,achieving a high quality factor Q=ωpτand large separation between Drude and plasmon responses.In addition,since the DC conductivity of graphene isσ=neμ,high mobility graphene would enable a strong plasmon resonance peak(which is proportional to the DC conductivity of the graphene sheet)at low doping where the thermoelectric response is maximized.Single-element graphene photothermoelectric detectors based on Drude absorption10have already shown an unprecedented combina-tion of responsivity,NEP,and speed in few THz detection,and our scheme provides a route forward,as higher mobility is achieved in higher quality graphene,to detectors with higher efficiency(due to higher plasmonic absorption)and better spectral sensitivity(due to narrower plasmon resonance). 
Methods.The starting material is epitaxial single-layer graphene on(0001)semi-insulating(resistivity>109Ω-cm) 6H-SiC;see ref41for additional details.The2D graphene is patterned into a ribbon array using electron beam lithography with400nm thick PMMA[poly(methy methacrylate),Micro Chem Corp.]resist as an etch mask and oxygen plasma treatment to remove exposed graphene.Chromium/gold electrodes(thickness4nm/45nm)are thermally evaporated for the devices shown in Figure 1.For the bimetallic photothermal detectors,the liftoffmask is patterned via e-beam lithography using a bilayer resist[methyl methacrylate (8.5%)/methacrylic acid copolymer(MMA),Micro Chem Corp.;and PMMA].Dissimilar metal contacts are fabricated in one lithographic step using a tilted-angle shadow evaporation technique42for the devices shown in Figures2−5.Chromium (20nm)and gold(25nm)are deposited at different evaporation angles.As afinal step,a droplet of electrolyte (LiClO4/PEO=0.12:1)is used to cover the whole device for applying top gate voltages.Far infrared transmission measurements are performed in a BOMEM DA-8FTIR system with mercury lamp as a source and4K silicon composite bolometer as a detector.The2×2 mm2device is mounted on a copper plate with a2mm diameter aperture.The mounted sample is placed in vacuum at room temperature and is uniformly illuminated by the incident beam of8mm in diameter.We strongly overfill the sample aperture to minimize spectrometer diffraction losses at low frequencies.An electronically controlled rotating wire grid polarizer is placed in front of the sample.To minimize time drift of the signal,we consecutively measure transmitted spectrum through the device and an identical bare aperture placed in the sample position at each gate value,and their ratio gives us the absolute transmission.Finally,we divide all transmission spectra by the transmission spectrum measured at the Dirac point.Model calculations mimic this experimental procedure.The THz photoresponse is characterized by illuminating the device with a chopped continuous wave laser beam and detecting the open-circuit photovoltage signal using a voltage preamplifier and lock-in amplifier.The THz laser is optically pumped by CO2-laser resonator with Methanol-D(CH3OD) vapors generating a line at5.3THz(175cm−1)frequency.The sample is mounted on the same copper plate as in the FTIR measurements,and the beam illuminates the device through the SiC substrate to avoid the absorption by the electrolyte.The same rotating polarizer is placed in front of the focusing parabolic mirror(D=F=50mm).The photovoltage is continuously normalized by the signal of the pyroelectric reference detector.The sample is mounted on an x-y-z scanning stage together with another pyro-detector,which is used for the power calibration(including signal for rotating polarizer). 
The charge density oscillation at the plasmon resonance frequency was obtained using a finite-element-method frequency-domain simulation. Plane-wave excitation (7.4 THz) was simulated with a polarization parallel and perpendicular to the graphene ribbons. The geometrical parameters of the element are the same as for the real device described in the text. The carrier density of graphene was taken to be 2 × 10¹³ cm⁻². The mobility was taken to be 5000 cm² V⁻¹ s⁻¹, which is possibly higher than that of the real device, to illustrate the plasmon mode more clearly.

To model the relative attenuation through the device at different gate voltages, we first calculate the transmission of the graphene ribbons using the thin-film expression:43 T = 4n1n2/|n1 + n2 + Z0σ|², where n1 = 1.73 and n2 = 3.1 are the refractive indices of the electrolyte and the SiC substrate, Z0 = 377 Ω is the impedance of free space, and σ is the AC conductivity of graphene. The AC conductivity σ can be written as σd = σ0/(1 + iωτ) for the Drude response and σp = σ0/(1 + i(ω² − ωp²)τ/ω) for the plasmon excitation, where σ0 is the DC conductivity, ω is the frequency, τ is the electron scattering time, and ωp is the plasmon resonance frequency. Both σ0 and τ can be expressed as functions of the carrier density n and mobility μ of graphene, written as σ0 = neμ and τ = (πn)^(1/2)ℏμ/(e vF), where e is the elementary charge and vF is the Fermi velocity. The relative attenuation is then expressed as ΔT = 1 − T(Vg)/T(Vg,min). To fit the attenuation spectra shown in Figure 1, panel c, we take a fixed μ = 1300 cm² V⁻¹ s⁻¹ and set n and ωp as fitting parameters. To plot the polarization-dependent attenuation through the device shown in Figure 2, panel a and Figure 4, panel a, we first calculate the effective average electric field seen by graphene, which is estimated as the electric field of the incident beam corrected by the extinction factor (f(ω,θ))^(1/2) of the metal grating. f(ω,θ) is defined as f(ω,θ) = cos²θ + sin²θ × Φ(ω), where Φ(ω) ∈ [0,1] is the ratio of the measured transmission at θ = 90° and 0° when the device is at the charge-neutral point.
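The attenuation model just described is compact enough to restate as code. The sketch below is our paraphrase (SI units); the function names, the use of the Drude branch for the reference spectrum, and the example values are assumptions, not the authors' implementation. With n = 1.6 × 10¹³ cm⁻² and μ = 800 cm² V⁻¹ s⁻¹ it reproduces τ ≈ 37 fs and a sheet resistivity of roughly 500 Ω quoted earlier in the text.

```python
import numpy as np

# Sketch of the thin-film attenuation model described above (our naming, SI units).
e = 1.602e-19        # elementary charge, C
hbar = 1.055e-34     # reduced Planck constant, J*s
v_F = 1.0e6          # Fermi velocity, m/s
Z0 = 377.0           # impedance of free space, ohm
n1, n2 = 1.73, 3.1   # refractive indices of electrolyte and SiC

def drude_params(n_cm2, mu_cm2Vs):
    """sigma0 = n*e*mu and tau = sqrt(pi*n)*hbar*mu/(e*v_F)."""
    n = n_cm2 * 1e4                    # cm^-2 -> m^-2
    mu = mu_cm2Vs * 1e-4               # cm^2/Vs -> m^2/Vs
    sigma0 = n * e * mu
    tau = np.sqrt(np.pi * n) * hbar * mu / (e * v_F)
    return sigma0, tau

def sigma_drude(omega, sigma0, tau):
    return sigma0 / (1.0 + 1j * omega * tau)

def sigma_plasmon(omega, sigma0, tau, omega_p):
    return sigma0 / (1.0 + 1j * (omega**2 - omega_p**2) * tau / omega)

def transmission(sigma):
    # Thin-film expression T = 4*n1*n2 / |n1 + n2 + Z0*sigma|^2
    return 4.0 * n1 * n2 / np.abs(n1 + n2 + Z0 * sigma)**2

sigma0, tau = drude_params(1.6e13, 800.0)
print(tau * 1e15, 1.0 / sigma0)                       # ~37 fs, ~500 ohm per square
omega = 2.0 * np.pi * 5.3e12                          # 5.3 THz excitation
T_gated = transmission(sigma_plasmon(omega, sigma0, tau, omega))       # on resonance
T_ref = transmission(sigma_drude(omega, *drude_params(1e11, 800.0)))   # near charge neutrality
print(1.0 - T_gated / T_ref)                          # relative attenuation Delta T
```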
APS interview review: digital image processing
Digital image processing is the use of computer algorithms to perform image processing on digital images. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and signal distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional systems. Digital image processing allows the use of much more complex algorithms for image processing, and hence can offer both more sophisticated performance at simple tasks and the implementation of methods which would be impossible by analog means.

Applications

Digital camera images
Digital cameras generally include dedicated digital image processing chips to convert the raw data from the image sensor into a color-corrected image in a standard image file format. Images from digital cameras often receive further processing to improve their quality, a distinct advantage that digital cameras have over film cameras. Digital image processing is typically executed by special software programs that can manipulate the images in many ways.

Intelligent transportation systems
Digital image processing has wide applications in intelligent transportation systems, such as automatic number plate recognition and traffic sign recognition.

Digital image acquisition
Digital imaging or digital image acquisition is the creation of digital images, typically from a physical scene. The term is often assumed to imply or include the processing, compression, storage, printing, and display of such images. The most usual method is digital photography with a digital camera, but other methods are also employed. Digital imaging was developed in the 1960s and 1970s, largely to avoid the operational weaknesses of film cameras, for scientific and military missions including the KH-11 program. As digital technology became cheaper in later decades, it replaced the old film methods for many purposes.

Methods
A digital photograph may be created directly from a physical scene by a camera or similar device. Alternatively, a digital image may be obtained from another image in an analog medium, such as photographs, photographic film, or printed paper, by an image scanner or similar device. Many technical images—such as those acquired with tomographic equipment, side-scan sonar, or radio telescopes—are actually obtained by complex processing of non-image data. Weather radar maps as seen on television news are a commonplace example. The digitization of analog real-world data is known as digitizing, and involves sampling (discretization) and quantization. Finally, a digital image can also be computed from a geometric model or mathematical formula. In this case the name image synthesis is more appropriate, and it is more often known as rendering. Digital image authentication is an issue for the providers and producers of digital images such as health care organizations, law enforcement agencies, and insurance companies.
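The digitizing step mentioned above (sampling a scene on a grid, then quantizing the samples to discrete levels) can be illustrated with a short sketch; the function name and the synthetic scene are ours, and only NumPy is assumed.

```python
import numpy as np

# Illustrative sketch of sampling (discretization) and quantization.
def digitize_scene(scene, height, width, levels=256):
    """Sample a continuous scene function scene(y, x) on a grid and quantize to 'levels' values."""
    ys = np.linspace(0.0, 1.0, height)              # spatial sampling
    xs = np.linspace(0.0, 1.0, width)
    samples = np.array([[scene(y, x) for x in xs] for y in ys])
    samples = np.clip(samples, 0.0, 1.0)
    return np.round(samples * (levels - 1)).astype(np.uint8)   # quantization to 8 bits

# Example: a smooth synthetic "scene" digitized to a 64x64, 8-bit image
img = digitize_scene(lambda y, x: 0.5 + 0.5 * np.sin(6.28 * x) * np.cos(6.28 * y), 64, 64)
print(img.shape, img.dtype, img.min(), img.max())
```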
There are methods emerging in forensic photography to analyze a digital image and determine if it has been altered.

Image enhancement
Camera or computer image editing programs often offer basic automatic image enhancement features that correct color hue and brightness imbalances, as well as other image editing features such as red-eye removal, sharpness adjustments, zoom features, and automatic cropping. These are called automatic because generally they happen without user interaction or are offered with one click of a button or mouse button or by selecting an option from a menu. Additionally, some automatic editing features offer a combination of editing actions with little or no user interaction.

Image restoration
Image restoration is the operation of taking a corrupted/noisy image and estimating the clean original image. Corruption may come in many forms such as motion blur, noise, and camera misfocus.

Image coding or digital data compression
Many image file formats use data compression to reduce file size and save storage space. Digital compression of images may take place in the camera, or can be done in the computer with the image editor. When images are stored in JPEG format, compression has already taken place. Both cameras and computer programs allow the user to set the level of compression. Some compression algorithms, such as those used in the PNG file format, are lossless, which means no information is lost when the file is saved. By contrast, the JPEG file format uses a lossy compression algorithm by which the greater the compression, the more information is lost, ultimately reducing image quality or detail that cannot be restored. JPEG uses knowledge of the way the human brain and eyes perceive color to make this loss of detail less noticeable.
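As a rough illustration of the lossless-versus-lossy distinction above, the sketch below saves one picture as PNG and at two JPEG quality levels and compares the file sizes. It assumes Pillow is installed and that a file named photo.tif exists; both names are placeholders.

```python
import os
from PIL import Image

# Save the same picture losslessly (PNG) and with two lossy JPEG quality settings.
img = Image.open("photo.tif").convert("RGB")

img.save("photo_lossless.png")           # PNG: no information is lost
img.save("photo_q90.jpg", quality=90)    # mild JPEG compression
img.save("photo_q30.jpg", quality=30)    # heavy compression: smaller file, more detail lost

for name in ("photo_lossless.png", "photo_q90.jpg", "photo_q30.jpg"):
    print(name, os.path.getsize(name), "bytes")
```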
Reconstruction of surfaces of revolution from single uncalibrated views
2 Properties of Surfaces of Revolution
Let γ(s) = [f(s)  g(s)  0]ᵀ be a regular and differentiable planar curve on the x-y plane, where f(s) > 0 for all s. A surface of revolution can be generated by rotating γ about the y-axis, and is given by

S(s, θ) = [ f(s) cos θ   g(s)   f(s) sin θ ]ᵀ    (1)

where θ is the angle parameter for a complete circle. The tangent plane basis vectors
* This project is partially funded by The University of Hong Kong.
if some strong a priori knowledge of the object is available, such as the class of shapes to which the object belongs, then a single view alone allows shape recovery. Examples of such techniques can be found in [7, 14, 10, 15, 6, 20, 17, 19], where the invariant and quasi-invariant properties of some generalized cylinders (GCs) [2] and their silhouettes were exploited to derive algorithms for segmentation and 3D recovery of the GCs under orthographic projection. This paper addresses the problem of recovering the 3D shape of a surface of revolution (SOR) from a single view. Surfaces of revolution belong to a subclass of straight homogeneous GCs, in which the planar cross-section is a circle centered at and orthogonal to its axis. This work is different from the previous ones in that, rather than the orthographic projection model, which is a quite restricted case, the perspective projection model is assumed. In [9], Lavest et al. presented a system for modelling SORs from a set of few monocular images. Their method requires a perspective image of an "angular ridge" of the object to determine the attitude of the object, and it only works with calibrated cameras. The algorithm introduced here works with an uncalibrated camera, and it estimates the focal length of the camera directly from the silhouette. Besides, an "angular ridge" is not necessary, as the algorithm produces a 2-parameter family of SORs under an unknown attitude and scale of the object. This paper is organized as follows. Section 2 gives the theoretical background necessary for the development of the algorithm presented in this paper. A parameterization for surfaces of revolution is presented and the symmetry properties exhibited in the silhouettes are summarized. In particular, the surface normal and the revolution axis are shown to be coplanar. This coplanarity constraint is exploited in Section 3 to derive a simple technique for reconstructing a surface of revolution from its silhouette in a single view. It is shown that under a general camera configuration, there will be a 2-parameter family of solutions for the reconstruction. The first parameter corresponds to an unknown scale in the reconstruction resulting from the unknown distance of the surface from the camera. The second parameter corresponds to the ambiguity in the orientation of the revolution axis on the y-z plane of the camera coordinate system. It is shown in the Appendix that such ambiguities in the reconstruction cannot be described by a projective transformation. The algorithm and implementation are described in Section 4, and results of real data experiments are presented in Section 5. Finally, conclusions are given in Section 6.
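To make the parameterization of eq. (1) in Section 2 concrete, here is a minimal sketch that samples 3D points on a surface of revolution from a user-supplied profile curve (f(s), g(s)); the function and variable names are ours, not the paper's, and only NumPy is assumed.

```python
import numpy as np

# Sample points S(s, theta) = [f(s) cos(theta), g(s), f(s) sin(theta)] on a surface of revolution.
def surface_of_revolution(f, g, s_vals, n_theta=64):
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    S, TH = np.meshgrid(s_vals, thetas, indexing="ij")
    x = f(S) * np.cos(TH)
    y = g(S)
    z = f(S) * np.sin(TH)
    return np.stack([x, y, z], axis=-1)    # shape (len(s_vals), n_theta, 3)

# Example: a vase-like profile with radius f(s) > 0 and height g(s) = s
s = np.linspace(0.0, 1.0, 50)
points = surface_of_revolution(lambda u: 0.3 + 0.1 * np.sin(3.0 * u), lambda u: u, s)
print(points.shape)
```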
Creating a comic-style portrait
Turning your own photo into comic-art style in Photoshop

a) Preparing the Canvas
Open the picture in Photoshop that is going to become pop art and duplicate the layer called "Background". (Just click the layer called "Background" and drag it to this icon at the bottom of the layer window to duplicate the layer.) Rename this new layer "dots". (To rename a layer, right-click on the name in the Layer Palette and select Layer Properties.) Create another new layer and fill it bright blue using the Paint Bucket tool. Drag this layer between the two existing layers and rename it something meaningful, i.e. "blue". The image below shows what your Layer Palette should look like. This is the basic set up to begin.

Now, working on the "dots" layer, we need to clear out all the unwanted parts of the photograph. In this case I want to isolate Scarlett and delete the rest, i.e. the lilac background and a bit of text. To cut out Scarlett I use the Pen Tool. Now, to sum up how to use Photoshop's pen tool in a few sentences isn't easy... if you have never used the pen tool before, do that first. Remember to make sure the pen tool is set to create a Work Path. See below. *NOTE: You could use the eraser tool, but the results won't be as professional. Take the pen tool, create a path around the person (or thing) and then make it into a selection. Invert the selection (Ctrl + Shift + I) and hit delete. Deselect (Ctrl + D).

b) Making the Dots
Desaturate the "dots" layer (Ctrl + Shift + U). Next adjust Threshold to something dramatic (still working on the "dots" layer). Image >> Adjustments >> Threshold... The settings I used are shown in the image below, but you will need to experiment to see what threshold settings work for YOUR image. Using Threshold will leave the image looking very pixelated (jagged), so apply Gaussian Blur (approximately 2-3 pixels should do it). Filter >> Blur >> Gaussian Blur... In your Layers Palette right-click on the "dots" layer and select Duplicate Layer... See image below. Select New for the Destination Document. Now you will have 2 documents open in Photoshop. Working on your new document, change the Mode to Greyscale. Image >> Mode >> Greyscale. A dialogue box appears... "Discard color information?"... Click OK. Now change the Mode to Bitmap. Image >> Mode >> Bitmap. A dialogue box appears... "Flatten layers?"... Click OK. Choose Halftone Screen in the Bitmap options window. Click OK. See image below. Next appears the Halftone Screen box. Apply the settings shown below. Note you may want to experiment with the Frequency as this decides the size of the dots. Click OK. Almost there with the dots. All that's left is to transfer the dots back to the first document. (See part c.)

c) Organising
Change the Mode back to Greyscale. A dialogue box will appear... Make sure the size ratio is 1 and click OK. Now change the Mode back to RGB. In your Layers Palette right-click on the layer and select Duplicate Layer. There should be three options as the Destination Document. Choose your original psd (which should be the top one). See image below. Your Layers Palette should look like the one shown below. You can close the second psd that you created... there's no need for it now. The final step for creating the dots is to create a Clipping Path between the new layer and the "dots" layer. To create a Clipping Path... hold down the ALT key and move your cursor between the 2 layers in your Layers Palette. When the cursor turns into a "double bubble" (see image below) click to create the clipping path. Now link the "dots" and "Background copy" layer.
See image below.Merge Linked layers (Ctrl + E)If you are using Photoshop CS or CS2 instead of linking and then merging the linked layers, after creating the clipping path simply click on the "background copy" layer in your layer palette and then Merge Down.Your picture should now look like the one shown below..only better because it's not so small and compressed!Now is a good time to Save (Ctrl + S) your work.STEP 2. Adding ColourHaving achieved a great looking half tone effect, it's time to add the colour.I'm going to use Fill Layers to colour this picture. Fill Layers are great if you are indecisive about your colour palette and fantastic for colour experimentation for pop art.At the bottom of this page I will briefly show how intermediate level photoshop users can really take their images one step further. Combining this tutorial with my tutorial you can really achieve stunning results a) Creating Fill LayersDuplicate the "dots" layer. Rename this layer "white" and drag it below the "dots" layer in your layer palette.Adjust the Brightness/Contrast on the "white" layer.Image >> Adjustments >> Brightness/Contrast...Set the Brightness to +100, and adjust Contrast until whiteNow the "white" layer is white! Your layers palette should look like the one shown belowChange the blend mode on the "dots" layer to Multiply.Click on this icon at the bottom of your Layer Palette and select Solid Colour...Select a colour in the Colour Picker dialogue box. Click OK. I've gone for a red shade to colour h er lips. You don't need to be too picky here because we can change the colour easily later onThe new fill layer created will appear in your layer window. Right-click and rename the layer e.g "lips" to colour the lips.Drag the new fill layer below the "dots" layer in your layer palette. See below. Your image is now totally filled with the colour of the fill layer...but don't worry we are about to fix that.Create a Clipping Path between the "lip" and "white" layer. This will ensure that you don't colour over the lines ^_^Change the foreground colour to black. Working on this new layer, take your Paint Bucket Tool and fill the "lips" layer black. The colour disappears...this is because the colour will only show up where there is white on this fill layer.Now for the colouring. Change the foreground colour to white. Get your Paint Brush Tool and start painting where you want to the colour to appear. Use a hard brush with the opacity set to 100%. Make sure that you zoom in when colouring, so that it is nice and tidy!**Fill layers can be confusing if you haven't used them before.If you are stuck try reading THIS PAGe, it's from another of my tutorials where I explain fill layers a little more indepth.b) More Fill LayersRepeat the above step creating a new fill layer for each colour/item in your picture. Below is my layers palette.Note how I have used clipping paths on all the new fill layers.At any time if you are unhappy with a colour that you have chosen, simply double click on the Layer Thumbnail(as shown above) and re-select a colour. Now you have the ability to change the colour of the hair for example, to a whole new colour in a second flat!!You're pretty much finished.You will need to add a caption, or a speech or thought bubble to make this a Lichtenstein inspired piece. 
Custom shapes have a few speech and thought bubbles to choose from.The font I used in my finished picture is ANIME ACE.Try experimenting with the colours too for something really bold.COMBINGING WITH LINE ART TUTORIALI've had a lot of people ask me how I create half tone shading on my line art pictures.to see my finished Scarlett Johansson pop art piece.I've written two tutorials on creating the line art.-Turn Photos of People into Line Art-Create Basic Line Art form Your PhotosThe picture of my layer palette below pretty much explains it all.Obviously the line art is the top layer.Duplicate the half tone layer and place it over each colour layer with a clipping path.Set the mode of the half tone layer to Soft Light (or something similar).Adjust the opacity of the half tone layer until it looks good.If you are interested in creating the pattern I used for the background of my final Scarlett picture, the pattern is made with a custom shape.If you are using Photoshop 7, CS or CS2 then you will have it already in your custom shapes (the arrow, marked 2, is pointing to it in the above image). You will need to select Show All to see it (see the image above, click on the area, marked with the 1 arrow to reveal custom shape options -Show All). If you are using an earlier version of Photoshop then you need to make it. Check out my Digital Candy Tutorial(just do the first page).That's it!I hope you found this Photoshop tutorial helpful. Feel free to contact me via my contact page if you have any questions.Also check out the following page to see some fantastic art that others have created by following this tutorial...。
Diffusion models (from Jianshu)
Title: A Brief Introduction to Diffusion Models (practical bilingual Chinese-English edition)

Diffusion models have gained immense popularity in the field of machine learning, particularly in the generation of images, text, and audio.
The core concept of diffusion models lies in simulating how a data distribution evolves between a clean, high-quality state and a noisy one: a forward process gradually corrupts clean data into noise. These models are trained to reverse this process, effectively generating new, high-quality data from random noise. One of the key advantages of diffusion models is their ability to generate data with high fidelity and diversity, making them highly suitable for various applications such as image synthesis, text generation, and audio processing.
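A minimal sketch of the forward (noising) half of this process is shown below; the linear beta schedule, the function names, and the toy "image" are illustrative assumptions rather than anything prescribed by the text, and the reverse (generative) half, a learned neural network, is omitted.

```python
import numpy as np

# Toy sketch of the forward diffusion process: gradually corrupt clean data into noise.
def forward_diffuse(x0, t, betas):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    alphas = 1.0 - betas
    alpha_bar_t = np.prod(alphas[: t + 1])
    noise = np.random.randn(*x0.shape)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule
x0 = np.random.rand(32, 32)               # stand-in for a clean image
x_mid = forward_diffuse(x0, 500, betas)   # partially noised sample
x_T = forward_diffuse(x0, T - 1, betas)   # nearly pure noise; a trained model learns to reverse this
```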
The "a to b" conversion algorithm
a转换成膜a算法As technology continues to advance at a rapid pace, it is crucial for algorithms to evolve and adapt accordingly. One algorithm that has garnered significant attention in recent years is the "a to b" algorithm. This algorithm, which aims to convert 'a' into 'b', is essential for various applications in data processing and analysis.随着技术的快速发展,算法需要不断演变和适应。
最近几年受到广泛关注的一种算法是“a转换成b”算法。
这种算法旨在将'a'转换为'b',在数据处理和分析等各种应用中起着关键作用。
One of the primary requirements for the "a to b" algorithm is efficiency. The algorithm should be able to convert 'a' into 'b' quickly and accurately, without compromising the quality of the output. This efficiency is essential for applications where real-time processing and response are crucial.“a转换成b”算法的主要要求之一是效率。
An image
An image (from Latin: imago) is an artifact, for example a two-dimensional picture, that has a similar appearance to some subject—usually a physical object or a person. An image is a material reproduction of what a person perceives visually.
Images can be acquired by optical devices such as cameras, mirrors, telescopes, and microscopes, or created by hand, for example through drawing and painting. Images can be recorded and stored on media that are sensitive to light signals, such as paper and photographic film. With the development of digital acquisition technology and signal processing theory, more and more images are stored in digital form.
A mental image exists in an individual's mind: something one remembers or imagines. The subject of an image need not be real; it may be an abstract concept, such as a graph, function, or "imaginary" entity. For example, Sigmund Freud claimed to have dreamed purely in aural-images of dialogs. The development of synthetic acoustic technologies and the creation of sound art have led to a consideration of the possibilities of a sound-image made up of irreducible phonic substance beyond linguistic or musicological analysis. Imagery, in a literary text, occurs when an author uses an object that is not really there in order to create a comparison with one that is, usually evoking a more meaningful visual experience for the reader.[1] It is useful as it allows an author to add depth and understanding to his work, like a sculptor adding layer upon layer to his statue, building it up into a beautiful work of art.

Imagism was a movement in early 20th-century Anglo-American poetry that favored precision of imagery and clear, sharp language. The Imagists rejected the sentiment and discursiveness typical of much Romantic and Victorian poetry. This was in contrast to their contemporaries, the Georgian poets, who were by and large content to work within that tradition. Group publication of work under the Imagist name appearing between 1914 and 1917 featured writing by many of the most significant figures in Modernist poetry in English, as well as a number of other Modernist figures prominent in fields other than poetry. Based in London, the Imagists were drawn from Great Britain, Ireland and the United States. Somewhat unusually for the time, the Imagists featured a number of women writers among their major figures. Imagism is also significant historically as the first organised Modernist English language literary movement or group. In the words of T. S. Eliot: "The point de repère usually and conveniently taken as the starting-point of modern poetry is the group denominated 'imagists' in London about 1910."[1] At the time Imagism emerged, Longfellow and Tennyson were considered the paragons of poetry, and the public valued the sometimes moralising tone of their writings. In contrast, Imagism called for a return to what were seen as more Classical values, such as directness of presentation and economy of language, as well as a willingness to experiment with non-traditional verse forms. The focus on the "thing" as "thing" (an attempt at isolating a single image to reveal its essence) also mirrors contemporary developments in avant-garde art, especially Cubism. Although Imagism isolates objects through the use of what Ezra Pound called "luminous details", Pound's Ideogrammic Method of juxtaposing concrete instances to express an abstraction is similar to Cubism's manner of synthesizing multiple perspectives into a single image.
2009-p133-hancock
Sticky Tools:Full6DOF Force-Based Interaction for Multi-Touch Tables Mark Hancock1,Thomas ten Cate1,2,Sheelagh Carpendale11University of Calgary Department of Computer Science {msh,sheelagh}@cpsc.ucalgary.ca2University of Groningen Department of Computer Science t.ten.cate.1@student.rug.nlABSTRACTTabletop computing techniques are using physically familiar force-based interactions to enable compelling interfaces that provide a feeling of being embodied with a virtual object. We introduce an interaction paradigm that has the benefits of force-based interaction complete with full6DOF manip-ulation.Only multi-touch input,such as that provided by the Microsoft Surface and the SMART Table,is necessary to achieve this interaction freedom.This paradigm is real-ized through sticky tools:a combination of stickyfingers,a physically familiar technique for moving,spinning,and lift-ing virtual objects;opposable thumbs,a method forflipping objects over;and virtual tools,a method for propagating behaviour to other virtual objects in the scene.We show how sticky tools can introduce richer meaning to tabletop computing by drawing a parallel between sticky tools and the discussion in Urp[20]around the meaning of tangible devices in terms of nouns,verbs,reconfigurable tools,at-tributes,and pure objects.We then relate this discussion to other force-based interaction techniques by describing how a designer can introduce complexity in how people can control both physical and virtual objects,how physical objects can control both physical and virtual objects,and how virtual objects can control virtual objects.INTRODUCTIONIn the physical world,an object reacts to a person’s actions depending on its physical properties and the forces applied to it.For example,a book can be stacked on top of another because it has twoflat sides or a pencil can be rolled along a desk because it is cylindrical.People often make use of the unique properties of objects to make them affect other objects in different ways.People use pencils to write,ham-mers to insert nails,and utensils to cook food.In the virtual world,how objects react to human intervention depends on a particular mapping of human movement to computer feed-back.For example,pressing a button with a mouse cursor can cause a variety of behaviour,including opening a menu, advancing to the next page of a document,or invoking a new Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on thefirst page.To copy otherwise,to republish,to post on servers or to redistribute to lists,requires prior specific permission and/or a fee.ITS’09,November23–25,2009,Banff,Alberta,Canada.Copyright©2009978-1-60558-733-2/09/11...$10.00Figure1:A screenshot of a3D virtual scene.window to appear.There are benefits to both worlds;in the physical world,people become familiar with the capabilities of the tools they use regularly;in a virtual world the result of a person’s actions can be made to either use or extend physical limits.Since tabletop displays afford direct touches for interaction, the techniques have a feeling of being more physical than, for example,mouse or keyboard interaction.This directness of interaction with virtual objects opens up the potential for interactive tables to simultaneously leverage the benefits of both the physical and the virtual.The research question is: how does 
one maintain the feeling of physical interaction with the full capabilities to manipulate a3D scene such as in Figure1?Many of the techniques that have been designed specifically for digital tables are based(either explicitly or implicitly)on how objects react in the physical world.How-ever,these techniques typically resort to techniques such as gestures[1,22]or menus[18]to provide full functionality.We introduce sticky tools—virtual3D tools that can be ma-nipulated in the full six degrees of freedom(DOF)of trans-lation and rotation—to allow force-based interaction to pro-vide full control of a system,without the need for gestures or menus.Wefirst describe the historical progress of force-based interaction,we then introduce sticky tools,and then we demonstrate how sticky tools can be used to assign richer meanings to virtual objects.We end with a discussion of how sticky tools leverage the largely unexplored research area of 133how virtual objects interact with other virtual objects and describe how this research direction can overcome existing limitations in tabletop interaction.RELATED WORKDigital tables have used force-based interaction since they were introduced,both explicitly through metaphor and im-plicitly through2D or3D manipulation.Force-Based MetaphorsMany tabletop display interfaces use force-based metaphors to create compelling new interactions.The Personal Digital Historian[17]uses the idea of a“Lazy Susan”to invoke the metaphor of spinning virtual objects to another side of the table.The Pond[19]uses the metaphor of a body of water where virtual objects can sink to the bottom over time. Interface currents[9]demonstrate how the idea offlow can be applied to virtual objects;virtual objects can be placed in a dedicated area on the table that acts like a river,carrying the virtual objects to another part of the screen.A more abstract property of force-based interaction is that lo-cal actions only cause local behaviour,though this behaviour can then propagate to have a larger area of influence.For example,dropping a stone in water initially affects a small area,and over time its ripples eventually affect the entire body of water.Isenberg et al.[10]integrated this locality property into a framework for building tabletop display in-terfaces.With this framework,tabletop interfaces can be created where virtual objects adhere to this property.2D Force-Based InteractionA significant body of tabletop literature focuses on how to move and rotate virtual objects on a digital surface.One of the overarching results of studies[13,14,21]involving movement and rotation is that simulating(at least to some degree)how movement and rotation happen with physical forces typically results in both improved performance and a compelling feeling of embodiment with the virtual objects. 
The rotate n’translate(RNT)technique[12]for moving and rotating objects uses the metaphor of an opposing force act-ing on a virtual object to make it rotate while moving.This technique has also been extended so that,when let go,an object will continue along its trajectory according to the cur-rent speed of movement.This extension produces the ability to“flick”or“toss”objects across the screen[9,10].The TNT techniques[14]use3DOF to more directly simulate the movement observed in studies of moving and rotating paper on physical tables.With this method,a person can place their hand or a physical block on a virtual object and the position and orientation of the hand or block controls the movement and rotation of the virtual object.On multi-touch tables,twofingers are typically used for a combined move-ment,rotation and scaling of a virtual object.The position of thefirst touch is used to determine the movement of the object and the position of the second touch relative to the first is used to determine the rotation and scale.This tech-nique simulates how movement and rotation can occur with physical objects if frictional force between thefingers and objects is considered.The scaling aspect is an example of how this familiar force-based behaviour can invoke virtual behaviour not possible in the physical world(i.e.,magically growing or shrinking objects).ShapeTouch[4]provides force-based interactions on2D vir-tual objects,such as pushing objects from the side,tossing them across the screen,peeling them back to place other objects underneath,and more.These techniques use the sensory data to invoke complex but physically familiar be-haviour on the objects that are in direct contact with a per-son’s hands and arms.3D Force-Based InteractionHancock et al.[7]extended the idea of moving and rotating objects to three-dimensional virtual objects on a table.They used the same metaphor as RNT of an opposing force with a single touch point on an object.Their studies show,however, that the feeling of“picking up”an object is better approxi-mated by using morefingers(up to three).With threefingers, people can have full control of the6DOF of movement and rotation in a3D environment[7].Force-based effects such as collisions,gravity,mass,and inertia can also be integrated into3D environments through the use of a physics engine (e.g.,BumpTop[1]).The image data provided through many multi-touch input devices(FTIR[6],Microsoft Surface1, SMART Table2)can be more directly integrated into such physics engines by creating physical bodies(either through proxies or particle proxies)that then can interact with the virtual objects through the physics engine[21].Because a person’s hands andfingers(or even other physical objects) have a virtual representation in the physics engine,these can be used to push other virtual objects around.The use of forces in general and the use of physics-based forces in3D virtual worlds in particular,have immense ap-peal as a basis for interaction on multi-touch tables.How-ever,while many appealing interactions have emerged,they fall short of the full functionality required for practical ap-plications.For instance,BumpTop[1]resorts to a symbolic gestural language,which has an associated learning curve and the need for memory retention.Wilson et al.[21]point the way to interactions that extend physical real-world re-sponses into the virtual world,but fall short in that the re-alized virtual interactions provide only the ability to move invisible proxies,and not to spin,flip,or lift the virtual 
objects.In essence,this work provides no equivalent to an opposable thumb and has made a direct call for the ability to pick objects up and place them inside others–capabili-ties offered by our sticky tools approach.Another approach to manipulating2D and3D objects is to use the space in front of the display[11,15]to extend interaction capabilities, however,this has been only accomplished through additional hardware such as markers and vision-based systems.Sticky tools achieves all6DOF without additional hardware. STICKY TOOLS1Microsoft Surface./surface2SMART Table./table134The ACM International Conference on Interactive Tabletops and Surfaces2009(a)Move(b)Rotate2D(c)Lift(d)Rotate 3DFigure 2:Sticky fingers and opposable thumbs.In this section,we introduce sticky tools ,a combination of three concepts:sticky fingers ,opposable thumbs ,and virtual tools .We use sticky fingers and opposable thumbs to enable full control of a single object in a 3D virtual scene.We then use the virtual tools concept to take this full control of a single object and use it to enable full functionality within that system.Thus,sticky tools are a mechanism to improve upon existing force-based interaction techniques so that they can provide full functionality to a multi-touch table,without the need for symbolic gestures or menus.Of the three significant aspects of controlling single 3D vir-tual objects that are discussed in the 3D interaction literature (selection,navigation,and manipulation [3])we focus on se-lection and manipulation.Selection and some manipulation are available via sticky fingers,full manipulation of single objects requires the addition of the opposable thumb.The possibility of navigation is realizable with virtual tools and is discussed as future work.Sticky FingersIn 2D multi-touch interfaces,the two-finger move /rotate /scale interaction technique has become ubiquitous.Be-cause one’s fingers stay in touch with the virtual object in the location they initially contact,this can be referred to as a sticky-finger interaction.This perception of touching the virtual object persists through the interaction,providing some of the feedback one might expect in the physical world.The scaling action of spreading one’s fingers also maintains stickiness,still providing a person with the feeling that they are controlling two points on the virtual object.However,this scale aspect would be impossible in the physical world (at least for rigid bodies),thus it combines the partial phys-icality of the sticky fingers with the potential for magic that the computer offers.Sticky fingers works well in 2D,providing move (in x and y),spin (rotate about z)and scale.In 3D the first two of these capabilities can be directly mapped giving move and spin,however in 3D two additional factors are missing:lift and flip.To address this we extend the 2D sticky-fingers tech-nique together with ideas from the three-finger technique described by Hancock et al.[7]to create a technique to manipulate 3D virtual objects rendered in perspective.When only two fingers are used,the points of contact remain under one’s fingers.Similarly to the 2D technique,as the fingers move about the display,the virtual object moves with them (Figure 2a),and as the fingers rotate relative to one another,so does the virtual object (Figure 2b).In 3D,as the distance between the fingers gets larger,the virtual object moves towards the perspective viewpoint causing the object to appear larger (Figure 2c).Thus sticky fingers in 3D provides lift.Note that the virtual 
object’s size in the 3D model will not change,only its distance to the viewpoint.Sticky Fingers &Opposable ThumbsWith two sticky fingers alone,one can not flip a virtual ob-ject over while maintaining the stickiness property,since the initial contact points are likely to become hidden.To flip the object about x and y,the third finger is used as relative input,providing rotation about the axis orthogonal to the direction of movement (Figure 2d).The third finger is the opposable thumb .Unlike actual thumbs,one can use any finger to provide the virtual flipping functionality that our opposable thumbs provide in the real world.Instead of mapping the first two fingers to move (in x and y),rotate (about z),and scale,we map them to move (in x,y,and z)and rotate (about z).The third finger is then used for rotation about x and y.This technique provides full control of all 6DOF,enabling behaviour such as lifting objects and flipping them over.It is possible to maintain the stickiness property of the first two fingers when the third finger is active by using the axis defined by these two fingers as the axis of rotation.The disadvantage,however,is that movement along this axis with the third finger would not have any effect on the virtual ob-ject,and achieving the desired rotation may require defining a new axis of rotation (by lifting one’s fingers and reselecting the axis with the first two fingers).This disadvantage led to the design decision to use relative interaction for the third touch for 3D rotations.For selection,we use a combination of crossing [2]and stan-dard picking to select the initial point of contact for each of the three fingers.Thus,the order that the fingers come in contact with a virtual object determine the points used for movement and rotation.By extension,in a multi-touch environment,where for instance the flat of one’s hand orThe ACM International Conference on Interactive Tabletops and Surfaces 2009135one’s forearm could be interpreted as a series of touches, all objects crossed would be moved with one’s hand.As a result,a person can use theirfingers and arms to perform actions on multiple objects simultaneously(e.g.,sweeping all objects to the side).This sweeping action relates to the sweeping actions in Wilson et al.[21]without requiring the use of the physics engine.Virtual ToolsWhile together stickyfingers and opposable thumbs provide a way to select and fully manipulate a single2D or3D virtual object,more complex interactions,such as using an object to push another object around or changing an object’s prop-erties(e.g.,density,colour)are not possible.We introduce virtual tools to enable more complex interactions on virtual objects.A virtual tool is a virtual object that can act on other virtual objects and is able to cause changes to the receiving object.Any virtual object that is controlled with stickyfin-gers and opposable thumbs becomes a sticky tool.While virtual tools can exist in any virtual environment,we realized our virtual tools within a simulated real world by using a physics engine3.Thus,when a person interacts with a virtual object,it is placed under kinematic control so that other virtual objects will react physically to its movement, but the contact with the stickyfingers gives control of the object to thefingers.Thus,the object can now be used to hit other objects,but will not be knocked from the sticky contact.When the sticky tool makes contact with another object,it can cause physically familiar behaviour but these contacts can also be detected and 
made to invoke abstract actions, such as re-colouring the object.

The concept of sticky tools is useful in explaining previous work. The technique introduced by Wilson et al. [21] can be thought of as an example of a very simple virtual tool. Their interaction technique can be described as controlling the 2D position of many invisible virtual objects, and these invisible objects interact with other objects in the scene through the use of a physics engine. In this framework the proxies can be considered to be a virtual tool whose behaviour is always to invoke frictional and opposing forces on other virtual objects. Similarly, the joint technique used in BumpTop [1] allows 3D icons to act as virtual tools that cause collisions that invoke behaviour on other 3D icons.

Table 1 shows a comparison of the features of joints (J), proxies (P), sticky fingers (SF), and sticky fingers with opposable thumbs (SF+OT). They are compared on many commonly provided multi-touch interactions. Sticky fingers and opposable thumbs offer a more complete set of these interactions than any other; however, all have some gaps and this is not a complete list of all possible functionality. For any of these approaches the gaps can be addressed by virtual tools. That is, with virtual tools the functionality of any of the unchecked cells in Table 1 can be enabled. For example, sticky fingers and opposable thumbs can use a virtual tool to push or surround other objects. This is also true for the joints technique or sticky fingers alone (without opposable thumbs). A virtual tool could be used in combination with either the joints technique or the proxies technique to lift objects in the third dimension. For example, a platform could be introduced that objects could be moved onto. The platform could then be used to lift the objects through the use of a dial, a slider, or an elevator virtual object. Similar virtual objects could also be imagined that could enable flipping and spinning of virtual objects. Virtual tools also offer new potential for additional functionality not possible with any previous single technique.

Table 1: Comparison of different techniques for interacting with 3D virtual objects on a table. Techniques compared: joints (J), proxies (P), sticky fingers (SF), sticky fingers with opposable thumbs (SF+OT), and sticky tools (ST). Features compared: Lift (move in z), Drag (move in x and y), Spin (rotate about z), Flip (rotate about x/y), Push, Toss, Surround (contour), Additional Points, Usable with Fingers, Usable with Objects.

3 NVIDIA Corporation. /physx

UNDERSTANDING VIRTUAL OBJECTS
In essence, the difference between the use of virtual tools and previous techniques comes down to the ability to assign richer meaning to virtual objects. This assignment of meaning is analogous to a similar discussion introduced by Underkoffler and Ishii [20] for their luminous tangible system. They showed how tangible objects could be assigned richer meaning to expand interaction possibilities. We parallel their discussion on luminous tangible object meanings with a discussion on virtual tool object meanings. The discussion on virtual tool meanings is followed by a generalized model of how force-based interaction can be used to provide all this functionality by changing the complexity of how people control both physical and virtual objects, as well as how those physical and virtual objects can control each other.

Virtual Object Meanings
In this section we provide examples of how using virtual objects to control other virtual objects can enrich interaction.
We demonstrate this richness by mirroring Underkoffler and Ishii's [20] description of how tangible objects can take on different object meanings along the spectrum shown in Figure 3.

Figure 3: The spectrum of object meanings used in Urp [20] to describe tangible devices.

Figure 4: Virtual objects as verbs. Figure 5: Virtual objects as nouns.

In each of the following subsections, we first state the definition used by Underkoffler and Ishii to describe the different object meanings (with modifications so that they describe a virtual environment, instead of a tangible system) and then describe an example of a sticky tool whose meaning can be interpreted using this definition. We thus show that sticky tools enable virtual objects to take on all of the possible meanings of tangible luminous objects.

Virtual Objects as Nouns
“These objects occupy the center of the axis and are likely the most obvious in their behavior. They are fully literal, in the sense that they work in their [virtual] context very much the way objects ‘operate’ in the real world—an Object As Noun exists in our applications simply as a representation of itself: an immutable thing, a stand-in for some extant or imaginable part of the real-world.” [20, p. 392]

A virtual object as a noun stands for itself in the virtual world, that is, for what it appears to be. Thus, if it looks like a ball it should behave like a ball. In a virtual 3D environment, we can render any mesh of triangles that has been modeled. Thus, rigid bodies of virtually any shape can be added to the environment and made to interact with other rigid bodies using the physics engine. Thus, these virtual objects can operate in the virtual world in a way that is similar to how they behave in the real world. For example, a set of bowling pins in the environment can be knocked over using a virtual bowling ball (Figure 5).

Figure 6: Virtual objects as reconfigurable tools.

Virtual Objects as Verbs
“As we move to the right along the continuum, away from Object As Noun, inherent object meaning is progressively abstracted in favor of further—and more general—functionality ... It is not understood as ‘present’ in the [virtual] world ... but exists to act on other components that are, or on the environment as a whole.” [20, p. 392]

A virtual object as verb exists as a virtual object but embodies actions. That is, the appearance of the object symbolizes the possibility of an action. In our virtual environment, we include a cloth that embodies ‘wrapping’ (Figure 4). Dropping a cloth on another object wraps that object. We leave evidence of this wrapping by changing the affected object's colour, providing a way to colour objects. The act of covering another virtual object with a cloth can be further abstracted to provide a variety of different functions. We also provide a lamp sticky tool that embodies the actions of shedding light and casting shadows, and can be used as the sundial in Urp to simulate changing the time of day. This sticky tool differs from the tangible device in Urp in that the lamp can be made to disobey the law of gravity and to pass through other objects in the environment.

Virtual Objects as Reconfigurable Tools
“This variety of object-function is fully abstracted away from ‘objecthood’, in a way perhaps loosely analogous to a GUI's mouse-plus-pointer.” [20, p. 392]

A virtual object as a reconfigurable tool is an object that can be manipulated to affect other objects. It does not stand for itself as a noun, or imply an
action as a verb,but instead sym-bolizes a functionality.We create a compound sticky tool consisting of a drawer object and a dial(Figure6).When a figurine is placed inside this drawer,the dial can be rotated to grow or shrink thefigurine.This compound sticky tool could be reconfigured to perform any action that involves changing a one-dimensional property of another virtual object along a continuous axis.For example,it could be used to change an object’s density or elasticity.Virtual Objects as Attributes“As we move to the left away from the center of the axis, an object is stripped of all but one of its properties,and it is this single remaining attribute that is alone considered by the system.”[20,p.392]A virtual object as attribute represents one and only one of its attributes.For an example,we create another compound sticky tool for painting the background of the environment (Figure7).This sticky tool includes a group of four buckets that each contains a different texture and a hose that extends from below the buckets.In the case of the bucket,the only attribute that matters is its texture.The shape,size,den-sity,location and all other attributes are abstracted from this virtual object.To paint the background a person selects a bucket with onefinger to activate the hose and then,with the other hand,can move the hose’s nozzle to indicate the area of the background to paint.Movement in the z-direction affects the area of influence of the hose(the farther from the background,the larger the radius of influence).Touching the texture bucket activates the texture thatflows along the hose into the environment.Virtual Objects as Pure Objects“This last category is the most extreme,and represents the final step in the process of stripping an object of more and more of its intrinsic meanings.In this case,all that matters to a[virtual]system is that the object is knowable as an object (as distinct from nothing).”[20,p.392]A virtual object as pure object is a symbol and stands for something other than itself.We create a sticky tool that allows the storage of the locations of all of thefigures inFigure8:A diagram of thefive major components of force-based inter-action.a scene to be symbolized by a pure virtual object.Which vir-tual object will perform this symbolic function is established by placing an object in a“save”drawer.Thereafter,the scene is essentially stored in this virtual object and can be reloaded by placing that samefigure in the empty environment.Thus, any virtual object can be stripped completely of its intrinsic meaning,and the locations of the remaining virtual objects can be(e.g.)“put in the dinosaur”.That is,the dinosaur now stands for the scene.Force-Based InteractionWe have introduced stickyfingers,opposable thumbs,and virtual tools as3D tabletop interaction concepts and dis-cussed them in relation to joints,proxies,particle proxies, and tangible devices.In this section,we generalize from these approaches to provide a framework that encompasses these techniques and indicates how existing functionalities in force-based interactions can be expanded.Using physical forces to control virtual objects has the ap-peal of being easy to understand and learn due to our ability to transfer knowledge from our experience in the physical world.However,in order to simulate physical behaviour in the digital world,two primary components are required:a sensing technology,and a display technology.The sensing technology takes actions from the physical world and trans-lates them into messages that 
can be understood by the computer, and the computer can then translate those messages into something virtual that can be understood as a physical reaction to the initial action.

In sensing and translating this information, there are several places where the complexity of the force-based action-reaction can vary. First, new sensing technologies can be invented to be able to identify more and more complex physical forces. Essentially, the computer can become better at understanding how people control physical objects (in multi-touch, through a person's fingers, or in tangible interfaces, through a person's use of a physical object). Second, as seen in our sticky tools, Wilson et al. [21] and Agrawala et al. [1], the mapping from what is sensed to the system response can be made to include complex physics algorithms that better simulate real-world responses. Third, a largely unexplored possibility is the introduction of complexity through how the system's response propagates in the virtual environment. That is, virtual objects can control other virtual objects. These real and virtual interaction possibilities can be summarized by: (1) people controlling physical objects, (2) physical …
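As a rough illustration of the sticky-fingers and opposable-thumb mapping introduced earlier in this paper, the sketch below is a hypothetical implementation, not the authors' code: the class, function names and gain values are assumptions, and the perspective "lift" is reduced to a simple offset along z. Two sticky contacts drive move, spin and lift; a third relative-motion contact drives flip about x and y.

```python
import math

class Object3D:
    def __init__(self):
        self.x = self.y = self.z = 0.0   # position (z = height toward the viewpoint)
        self.yaw = 0.0                   # rotation about z (spin)
        self.pitch = 0.0                 # rotation about x (flip)
        self.roll = 0.0                  # rotation about y (flip)

def update_sticky_fingers(obj, f1_prev, f1_cur, f2_prev, f2_cur, lift_gain=0.01):
    """Two sticky fingers: move (x, y), spin (about z) and lift (z).
    Each f*_prev / f*_cur is an (x, y) touch point on the table surface."""
    # Move: the midpoint of the two contacts stays under the fingers.
    obj.x += (f1_cur[0] + f2_cur[0] - f1_prev[0] - f2_prev[0]) / 2
    obj.y += (f1_cur[1] + f2_cur[1] - f1_prev[1] - f2_prev[1]) / 2
    # Spin: change in the angle of the vector between the two contacts.
    ang = lambda a, b: math.atan2(b[1] - a[1], b[0] - a[0])
    obj.yaw += ang(f1_cur, f2_cur) - ang(f1_prev, f2_prev)
    # Lift: spreading the fingers moves the object toward the viewpoint, so it
    # appears larger under perspective; the model itself is never scaled.
    dist = lambda a, b: math.hypot(b[0] - a[0], b[1] - a[1])
    obj.z += lift_gain * (dist(f1_cur, f2_cur) - dist(f1_prev, f2_prev))

def update_opposable_thumb(obj, f3_delta, flip_gain=0.01):
    """Third finger as relative input: rotation about the axis orthogonal to
    its direction of movement (flip about x and y)."""
    dx, dy = f3_delta
    obj.pitch += flip_gain * dy   # vertical third-finger motion flips about x
    obj.roll += flip_gain * dx    # horizontal third-finger motion flips about y
```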
Interlaced moving image signalcoding/decoding apparatus and method, and storage medium for storing coded interlaced moving image signalBACKGROUND OF THE INVENTIONThe present invention relates to an apparatus and a method forcoding/decoding a moving image (picture) and a storage medium for storing code trains (bit stream) of the moving image. Particularly, this invention relates to such an apparatus and a method, and a storage medium for processing and storing an interlaced image signal.In coding of a moving image, an interlaced image signal is deviatedin position of scanning lines per field. Therefore, inter-image prediction and intra-image coding with respect to the interlaced image signal need be devised as compared with the case of a non-interlaced image signal.As the procedure for coding an interlaced image signal, there are some coding systems which have already been standardized.The J. 81 system of ITU-R is a system in which a field is used as a fundamental processing unit, and inter-image predication isadaptively switched between inter-frame and interfield prediction.MPEG2 and DVC are another system in which a frame is used as a fundamental processing unit, and processing per field and frame unit are switched in finer macro block unit.An example of a coding apparatus and decoding apparatus for the interlaced image described above is shown in FIG. 1.In FIG. 1, an interlaced image signal input via input terminal 1 is coded and compressed by an interlaced image coder 71 into a code train. The code train (bit stream) is supplied to an interlaced image decoder 72.The interlaced image decoder 72 is paired with the interlaced image coder 71, to reproduce the interlaced image signal that is output via output terminal 25.Further, there has been contemplated, as disclosed in Japanese Patent Laid-Open No. 3(1991)-132278 (Japanese Patent Application No.1(1989)-271006 entitled "VIDEO SIGNAL TRANSFORM APPARATUS", a method for converting an interlaced image signal of 60 fields per secondinto an interlaced image signal of 30 frames per second to providenon-interlaced image coding.However, in this case, a violent flicker occurs when 30 frames per second are displayed on a display apparatus as they are. Therefore, one frame is divided into two fields to obtain 60 fields per second before display. The coding system used here is a non-interlacedcoding system like MPEG1.In the case of an image of low resolution in a system, such as, MPEG1, only one of two fields of an interlaced image signal is used whilethe other is cancelled, to obtain a non-interlaced image signal of 30 frames per second to be coded.Further, as an image format, there has been known a progressive image of 60 frames per second. This is called a sequential scanning because scanning lines are present also on scanning line portions decimatedin an interlaced image. The fundamental scanning line construction of the progressive image is the same as that of the non-interlaced image, which can be regarded as a high frame rate non-interlaced signal.The progressive image signal can be displayed on a display apparatus without flicker but a horizontal deflection frequency or a videosignal band is doubled. Thus, the progressive image signal cannot be displayed on a usual television. As the coding system, a non-interlaced coding system like MPEG1 can be employed. 
However, double processing speed is necessary because the progressive image has 60 frames per second.

As for the coding efficiency when an image of each format is coded, the necessary amount of information (bits) per pixel is smallest for the progressive image of 60 frames per second, next smallest for the non-interlaced image of 30 frames per second, and the interlaced image requires the largest number of codes.

The coding efficiency of the interlaced image is poor due to the presence of an aliasing component included in each field image. As viewed from the field, the interlaced image has not been vertically filtered so as to satisfy the sampling theorem, and it therefore includes many aliasing components.

In the case where an image is moving, the interlaced image can be processed by motion compensation in interfield prediction. However, the aliasing (folding) distortion tends to change from field to field, so that the prediction is imprecise and the coding efficiency is lowered. The coding efficiency is further lowered because the high-frequency components increase in the intra-picture processing.

On the other hand, in the case of the progressive moving image of 60 frames per second and the non-interlaced moving image of 30 frames per second, the difference between them lies only in the frame rate. The image with the higher frame rate has frames that are closer together in time, with less image change between frames. This results in precise inter-image prediction, which enhances the coding efficiency. With respect to intraframe coding, no difference is present between the progressive and the non-interlaced images.

Further, in the case of interlaced image signal coding, the coding efficiency is not enhanced by processing per field, because the field image or the predictive residue includes many aliasing components. Even in frame/field adaptive prediction, quickly moving images are subjected to the field processing, and the situation is similar to that described above.

As described above, the interlaced signal is inferior in coding efficiency to the non-interlaced signal.
Further, when the interlaced signal of 60 fields per second is converted into the non-interlaced signal of 30 frames per second, the movement of a reproduced image is not smooth as compared with an original interlaced image.SUMMARY OF THE INVENTIONAn object of the present invention to provide a moving image coding apparatus and method for coding an interlaced image signal by progressive coding after it is converted into a progressive image signal to achieve a high coding efficiency, a decoding apparatus and method for decoding the progressive-coded signal to reproduce the original interlaced image signal, and a storage medium storing the image signal obtained by the coding apparatus.The present invention provides a coding apparatus comprising: a progressive converter to interpolate scanning lines decimated from aninput interlaced image signal to convert the interlaced image signal into a first progressive image signal; a sub-sampler to sub-sample the first progressive image signal in a vertical direction to obtain a second progressive image signal with scanning lines less than scanning lines of the first progressive image signal; and a coder to encode the second progressive image signal to obtain a compressed code train.Further, the present invention provides a decoding apparatus comprising: a decoder to decode an input code train produced by progressive coding to obtain a first progressive image signal; and an image format converter to convert an image format of the first progressive image signal in a vertical direction into another format in which scanning lines are decimated to obtain an interlaced image signal.Further, the present invention provides a coding and decoding apparatus comprising: a progressive converter to interpolate scanning lines decimated from an input interlaced image signal to convert the interlaced image signal into a first progressive image signal; a sub-sampler to sub-sample the first progressive image signal in avertical direction to obtain a second progressive image signal with scanning lines less than scanning lines of the first progressive image signal; a coder to encode the second progressive image signal to obtain a compressed code train; a decoder to decode the code train to reproduce the second progressive image signal; and an image format converter to convert an image format of the second progressive image signal in a vertical direction into another format in which scanning lines are decimated to reproduce the interlaced image signal.Further, the present invention provides a coding method comprising the steps of: interpolating scanning lines decimated from an input interlaced image signal to convert the interlaced image signal into a first progressive image signal; sub-sampling the first progressive image signal in a vertical direction to obtain a second progressive image signal with scanning lines less than scanning lines of thefirst progressive image signal; and coding the second progressive image signal to obtain a compressed code train.Further, the present invention provides a decoding method comprising the steps of: decoding an input code train produced by progressive coding to obtain a first progressive image signal; and converting an image format of the first progressive image signal in a vertical direction into another format in which scanning lines are decimated to obtain an interlaced image signal.Further, the present invention provides a coding and decoding method comprising the steps of: interpolating scanning lines decimated from an input 
interlaced image signal to convert the interlaced image signal into a first progressive image signal; sub-sampling the first progressive image signal in a vertical direction to obtain a second progressive image signal with scanning lines less than scanning lines of the first progressive image signal; coding the second progressive image signal to obtain a compressed code train; decoding the code train to reproduce the second progressive image signal; and converting an image format of the second progressive image signal in a vertical direction into another format in which scanning lines are decimated to reproduce the interlaced image signal.Further, the present invention provides a storage medium storing a compressed code train that is obtained by interpolating scanninglines decimated from an interlaced image signal to convert the interlaced image signal into a first progressive image signal, sub-sampling the first progressive image signal in a vertical direction to obtain a second progressive image signal with scanning lines less than scanning lines of the first progressive image signal, and coding the second progressive image signal to obtain the compressed code train.BRIEF DESCRIPTION OF THE DRAWINGSFIG. 1 is a view showing a conventional moving image coding and decoding apparatus;FIG. 2 is a block diagram showing an embodiment of an image coding apparatus according to the present invention;FIG. 3 is a block diagram showing an embodiment of an image decoding apparatus according to the present invention;FIG. 4 illustrates image formats in respective processing stages according to the present invention;FIGS. 5A and 5B are views showing the frequency characteristics before and after sub-sampling according to the present invention;FIG. 6 is a block diagram showing a progressive image coder according to the present invention; andFIG. 7 is a view showing image types in the MPEG coding according to the present invention.DESCRIPTION OF PREFERRED EMBODIMENTSAn embodiment of an image coding apparatus according to the present invention will be described hereinafter with reference to the drawings.FIG. 2 is a block diagram showing the embodiment of the image coding apparatus. An interlaced image signal input via input terminal 1 is supplied to a progressive converter 2. In the embodiment, the interlaced image signal involves 1080 (per frame), or 540 (per field) effective scanning lines.Numerals expressed below the block diagram in FIG. 2, such as,1080/60/2:1 indicate the number of effective scanninglines/frame(field) rate/interlacing ratio, respectively. Therefore, 60/2:1 indicates an interlaced image, and 60/1:1 indicates a progressive image.The progressive converter 2 converts the input interlaced imagesignal into a first progressive image signal with sequential scanning lines of multiple density by motion compensating interpolation processing to the scanning lines that have been made less in number by decimation in the input interlaced image signal. The first progressive image signal involves 60 frames per second, each having 1080 scanning lines. Such a progressive converter is disclosed in "Study of Sequential Scanning and Transform Method using Moving Compensation and Apparatus thereof" by Television Society, Technical Report BCS93-70.The first progressive image signal is subjected to sub-sampling by a vertical sub-sampler 3. 
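A minimal sketch of the coding chain of FIG. 2 up to this point is given below. The function names, array shapes and the stand-in interpolation and coding steps are assumptions for illustration only; the embodiment's motion-compensated progressive conversion, switched-filter sub-sampling and MPEG-style coding are merely stubbed.

```python
import numpy as np

def vertical_subsample_2_3(frame):
    """1080 progressive lines -> 720 lines; output line k sits at input line 1.5*k.
    Linear interpolation stands in for the switched FIR filters of sub-sampler 3."""
    out_lines = frame.shape[0] * 2 // 3
    out = np.empty((out_lines,) + frame.shape[1:], dtype=np.float64)
    for k in range(out_lines):
        pos = 1.5 * k
        i0 = int(pos)
        i1 = min(i0 + 1, frame.shape[0] - 1)
        w = pos - i0
        out[k] = (1 - w) * frame[i0] + w * frame[i1]
    return out

def encode_interlaced(fields):
    """Coding chain of FIG. 2: interlaced fields (1080/60/2:1) -> progressive
    conversion (1080/60/1:1) -> 2/3 vertical sub-sampling (720/60/1:1) ->
    progressive coding.  `fields` is a list of 540-line field images."""
    # Progressive converter 2: naive line doubling as a stand-in for the
    # motion-compensated scanning-line interpolation.
    frames_1080 = [np.repeat(f, 2, axis=0) for f in fields]
    frames_720 = [vertical_subsample_2_3(f) for f in frames_1080]
    # Progressive image coder 4: stand-in for MPEG-style coding at 60 frames/s.
    code_train = [f.astype(np.uint8).tobytes() for f in frames_720]
    return code_train
```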
The scanning lines of the first progressive image signal are sub-sampled by 2/3 to be a second image signal with 720 scanning lines.The sub-sampling is a general technique in image format conversion. The 2/3 sub-sampling is achieved by switching digital filters having three different tap factors.The second progressive image signal is coded by a progressive image coder 4. The progressive image coding is executed by raising the coding frame rate to 60 frames/second corresponding to the non-interlaced signal coding in MPEG1.When this invention is applied to MPEG2, the progressive image coder4 may not be needed because MPEG2 is capable of progressive image coding.In this way, a code train (bit stream) with the compressed second progressive image signal is obtained and output via output terminal 5. The output code train (bit stream) is transmitted to a decoding apparatus described below through a transmission line not shown. Or,it is stored in a digital storage medium shown in FIG. 2, such as, a disc and a magnetic tape, also not shown.FIG. 3 is a view showing an embodiment of an image decoding apparatus according to the present invention.The code train (bit stream) output from the coding apparatus shown in FIG. 2 is supplied to a progressive image decoder 22 via inputterminal 21. The progressive image decoder 22 reproduces the second progressive image signal by reducing the number of the scanning lines of the input code train (bit stream).The reproduced second progressive image signal is supplied to a vertical over-sampler 23 to reproduce the first progressive image signal having the original scanning lines before coded by the coding apparatus shown in FIG. 2.The reproduced first progressive image signal is supplied to an interlaced signal converter 24. The converter 24 simply decimates the scanning lines of the reproduced first progressive image signal to reproduce the interlaced image signal that has been input to the coding apparatus shown in FIG. 2. The reproduced interlaced image signal is output via output terminal 25.Here, the vertical over-sampler 23 may be configured so as not to generate scanning lines which are to be decimated and erased by the interlaced signal converter 24. This processing is one kind of the image format conversion. And, in this processing, every otherscanning line is generated by the vertical over-sampler 23 togenerate the interlaced image signal that is identical to that reproduced by the interlaced signal converter 24. In this case, the processing of the vertical over-sampler 23 is halved and theinterlaced signal converter 24 is not needed.FIG. 4 shows the transition of image format in the processing stagesof a series of coding and decoding as described above with referenceto FIGS. 2 and 3. The vertical direction shown in FIG. 4 is thevertical direction of the image processed by the coding and decoding apparatus of FIGS. 2 and 3. And, the horizontal direction indicates the change in time. Each dot represents a scanning line.In the description above, the image signal processed by the coding and decoding apparatus shown in FIGS. 2 and 3 is the interlaced image signal. However, a progressive image signal also can be processed by the coding and decoding apparatus shown in FIGS. 2 and 3. In this case, an input progressive image signal is directly supplied to the progressive image coder 4 in FIG. 2. 
The output signal is identical to that output via output terminal 5 of the coding apparatus of FIG.2 as described above.Therefore, in the present invention, the output compressed codetrains (bit streams) are identical to each other whether the image signal input to the coding apparatus shown in FIG. 2 is theinterlaced signal or the progressive image signal.Further, in the decoding apparatus shown in FIG. 3, when a display apparatus capable of displaying a progressive image is connected to the output terminal 25 of the decoding apparatus, the output signal of the progressive image decoder 22 or the vertical over-sampler 23 may be directly output via output terminal 25.In this way, in the present invention, the interchangeability can be improved between the interlaced and the progressive image signals.The sampling ratio in the vertical sampler 2 will be studied hereinafter.The interlaced image signal involves the same number of scanninglines as that of the non-interlaced image (progressive image) signal per frame. However, since the scanning lines of the interlaced image signal are decimated, a vertical resolution equal to the non-interlaced image signal cannot be obtained in the interlaced image signal. If the interlaced image signal is provided with a vertical high frequency component, this frequency component will not be displayed on a display apparatus as a high frequency component in the vertical direction but it will be an aliasing component and displayed as a high frequency component in the time direction to produce a violent line flicker.Usually, a TV camera for an interlaced image signal limits the frequency band of a generated interlaced image signal optically orelectrically to output an image signal with a vertically suppressed high frequency component as an interlaced image signal.A degree of the frequency band limit is called the Kell factor, andis usually about 0.7.Accordingly, an interlaced image signal converted into a progressive image signal by scanning line interpolation will be in the state limited in frequency band in the vertical direction.Thus, in the present invention, the number of scanning lines of the first progressive image signal output by the progressive converter 2 is reduced to match the Kell factor by sub-sampling processing of the vertical sub-sampler 3. Since a signal component is naturally not present in a signal band to be lost by the sub-sampling, no defect of information occurs.On the other hand, in the second progressive image signal having the number of scanning lines reduced by sub-sampling processing of the vertical sub-sampler 3, there is present signal components without vacancy within the band.FIGS. 5A and 5B show the frequency characteristics of the first progressive signal (before sub-sampling) and the second progressive signal (after sub-sampling). In the figures, the solid line indicates portions where the signal frequency components are actually present, and the dotted line indicates a transmissible band given by the sampling theorem. FIG. 5B shows that the signal frequency components actually present and the transmissible band given by the sampling theorem are identical to each other after sub-sampling.The sub-sampling ratio often takes an integer that ranges from 2/3 (=0.667) to 3/4 (=0.75), for example, for convenience in scanningline transforming. 
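As a rough consistency check on this choice of ratio (the line counts and the Kell factor of about 0.7 are taken from the text; equating the retained lines with the Kell-limited resolution is only an approximation):

\[
K \times 1080 \approx 0.7 \times 1080 = 756 \ \text{lines of useful vertical resolution},
\qquad
\tfrac{2}{3} \times 1080 = 720 \le 756 ,
\]

so the 720 scanning lines kept by the 2/3 sub-sampling lie within the band actually occupied by the Kell-limited interlaced source, and, ideally, no signal content is discarded.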
In the present embodiment, as already described above, 2/3 (=0.667) is employed as the sub-sampling ratio.The MPEG coding used in the present invention has different frame (picture) types, I, P and B, due to the difference in processing between images. The I frame (picture) is a frame independently coded in one frame, P is a one-directional predictive-coded frame and B is a bi-directional predictive-coded frame.FIG. 6 shows an example of the block diagram of the progressive image coder 4 shown in FIG. 2.Image signals (second progressive image signals) each composed of a frame output from the vertical sub-sampler 3 shown in FIG. 2 are sequentially stored in an image memory 52 via input terminal 51. Frames involved in the input image signals are replaced in accordance with the coding order and are supplied to a predictive signalsubtracter 53.The predictive signal subtracter 53 subtracts a predictive signal described later from one input image signal (one frame) to produce a predictive residue signal that is supplied to a discrete cosine transformer (DCT) 54. The transformer 54 performs discrete cosine transform to transform the predictive residue signal in pixel unit of8×8 into coefficients that are supplied to a quantizer 55.The coefficients are quantized by the quantizer 55 into fixed-length codes that are supplied to a variable-length coder 56 and an inverse-quantizer 59.A code train of variable-length codes converted from the fixed-length codes by the variable-length coder 56 is output via output terminal 5.The coefficients (the fixed-length codes) are reproduced by theinverse-quantizer 59 by replacing the quantized coefficients with representative values. The reproduced coefficients are supplied to an inverse discrete cosine transformer (DCT) 64 that performs theinverse processing of the discrete cosine transformer 54. The reproduced coefficients are then transformed into a reproductive residue signal that is supplied to a predictive signal adder 63.The predictive signal adder 63 adds the predictive signal and the reproductive residue signal to reproduce the image signal that is supplied to an image memory 62. An output of the image memory 62 is supplied to a motion compensator 61 and motion-compensated in accordance with motion vector information described later. Themotion-compensated image signal is supplied to a switch 58 as the predictive signal. The motion compensation is simple withoutswitching between field and frame processing, unlike the interlaced image coder 71 shown in FIG. 1.The switch 58 operates such that, in the P and B frames, thepredictive signal is supplied to the predictive signal subtracter 53 and the predictive signal adder 63, while in the I frame, the value zero is supplied thereto, in accordance with a control signal from an image type controller 57.The image memories 52 and 62 replace frames according to the coding order depending on whether the input image signal is a P or B frame in accordance with the control signal from the image type controller 57.The motion vector information is obtained from the input image signal by a motion estimater 60 in accordance with the input frame type.The image type controller 57 counts input frames whereby one frame out of a few frames is decided as a P frame, and one out of scores of frames is decided as an I frame. 
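Before turning to the choice of the P-frame interval, the following is a highly simplified sketch of the prediction/transform loop of FIG. 6 described above. SciPy's DCT stands in for the 8x8 transform; quantisation is reduced to a single step size, and motion estimation, variable-length coding and frame reordering are omitted.

```python
import numpy as np
from scipy.fft import dctn, idctn

def encode_frame(frame, prediction, q_step=16):
    """One pass of the FIG. 6 loop for a single frame whose dimensions are
    multiples of 8.  `prediction` is the motion-compensated reference frame
    (all zeros for an I frame).  Returns the quantised coefficients (a stand-in
    for the variable-length-coded code train) and the locally reconstructed
    frame that goes back into the image memory 62."""
    frame = frame.astype(np.float64)
    prediction = prediction.astype(np.float64)
    residue = frame - prediction                 # predictive signal subtracter 53
    coeffs = np.zeros_like(residue)
    recon_residue = np.zeros_like(residue)
    h, w = residue.shape
    for y in range(0, h, 8):                     # 8x8 block processing
        for x in range(0, w, 8):
            block = residue[y:y + 8, x:x + 8]
            c = dctn(block, norm='ortho')        # discrete cosine transformer 54
            q = np.round(c / q_step)             # quantizer 55
            coeffs[y:y + 8, x:x + 8] = q
            # inverse quantizer 59 and inverse DCT 64 of the local decoding loop
            recon_residue[y:y + 8, x:x + 8] = idctn(q * q_step, norm='ortho')
    recon = recon_residue + prediction           # predictive signal adder 63
    return coeffs, recon
```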
Most generally, the interval between two P frames is usually 3 frames, but in the present invention intervals of 2, 4 or 6 frames, each a multiple of 2, are used for the reasons mentioned below.

In the MPEG coding system as described, a P frame is subjected to a recursive prediction in which a reproduced image is used for the prediction of other frames, and a B frame is subjected to a non-recursive prediction in which a reproduced image is not used for prediction. Thus, when the quality of the P frames is made higher than that of the B frames, the entire image quality can be improved. To this end, the image of a P frame should not be changed, so that inter-frame prediction between P frames can be carried out adequately.

However, in the progressive image signal obtained by the progressive converter 2 shown in FIG. 2, the positional relation between an original input scanning line and an interpolated scanning line is reversed depending on whether the original input field is an even-numbered or odd-numbered field. That is, the scanning line construction is the same in all frames, but the interpolation is not completely carried out. Therefore, the image itself of each frame is somewhat different depending on the original fields.

Thus, the frames composed of the same type of fields processed by the progressive converter 2 shown in FIG. 2 are decided to be P frames by the image type controller 57 shown in FIG. 6. This is achieved by synchronizing the interval between two P frames with the interlacing ratio (usually, two). More specifically, the interval is an integer multiple (for example, 2, 4 or 6) of the interlacing ratio.

FIG. 7 schematically shows the frame types decided by the image type controller 57 shown in FIG. 6 in the case of MPEG coding. In FIG. 7, a symbol ○ indicates a scanning line present in the original image signal input to the coding apparatus of FIG. 2, and a symbol Δ indicates a scanning line interpolated by the progressive converter 2 of FIG. 2. Indicated under the symbols are the image types of the frames, where the upper stage indicates that a P frame is set every 2 frames, the middle stage every 4 frames and the lower stage every 6 frames.

Accordingly, P frames will always be frames composed of even-numbered or odd-numbered fields only, and are thus hardly affected by the interpolation processing of the progressive converter 2, whose interpolation may be somewhat different for even-numbered and odd-numbered fields.
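A small sketch of the resulting frame-type decision follows. This is a hypothetical helper, not the patent's controller; the I-frame interval of 24 is an example value chosen as a multiple of the P interval, and the only property the sketch enforces is that the P-frame interval is an integer multiple of the interlacing ratio, so every P frame originates from the same field parity.

```python
def assign_frame_types(num_frames, p_interval=4, i_interval=24, interlace_ratio=2):
    """Return a list of 'I', 'P', 'B' labels for `num_frames` progressive frames.
    `p_interval` must be a multiple of `interlace_ratio` (e.g. 2, 4 or 6) so that
    every P frame comes from the same type of field after progressive conversion."""
    assert p_interval % interlace_ratio == 0
    assert i_interval % p_interval == 0      # keep I frames on the same parity grid
    types = []
    for n in range(num_frames):
        if n % i_interval == 0:
            types.append('I')                # independently coded frame
        elif n % p_interval == 0:
            types.append('P')                # one-directional (recursive) prediction
        else:
            types.append('B')                # bi-directional (non-recursive) prediction
    return types

# Example: with p_interval=4, frames 4, 8, 12, ... are P frames and all share
# the same interpolation state (even- or odd-field origin).
```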
According to the present invention, the number of scanning lines is reduced after an interlaced image has been converted into a progressive image, and the image is subjected to progressive coding, whereby whole frames are always coded. This eliminates the aliasing components of field images that have posed a problem in conventional interlaced image coding, thus enabling a high coding efficiency to be obtained.

Further, according to the present invention, since the number of scanning lines is reduced, the amount of processing required for coding the progressively converted images is made smaller, and the amount of code is reduced.

Further, according to the present invention, the conversion ratio of the number of scanning lines by the sub-sampling is adjusted to the Kell factor, whereby coding can be performed without losing image information of the input interlaced signal.

Furthermore, according to the present invention, the interval of the one-directional interframe prediction is made an integer multiple of the interlacing ratio in coding, whereby the frames used for the one-directional interframe prediction, which is a recursive prediction, will always be frames in the same interpolation state. Therefore, the interframe prediction is not easily affected by the subtle differences between the frames, thus improving the coding efficiency.

As described above, according to the present invention, all the image information originally possessed by the input interlaced image signal can be coded with high efficiency.

Furthermore, since an image can be input and output at the intermediate processing stage of the present invention, the invention can also cope with the coding and reproduction of progressive images. It is therefore possible to process both interlaced and progressive images to obtain a common compressed code, thus greatly improving the interchangeability between the two image formats.
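As a companion to the coder sketch given earlier, the following is a minimal sketch of the decoder side of FIG. 3. The decode step and the interpolation kernel are stand-ins (assumed names and shapes, not the embodiment's filters); generating only the scanning lines of the required field parity implements the shortcut noted above for the vertical over-sampler 23, so the separate interlaced signal converter 24 is skipped.

```python
import numpy as np

def decode_to_fields(code_train, lines_720=720, lines_1080=1080):
    """Decoder side of FIG. 3: decode the code train to 720-line progressive
    frames, then over-sample vertically, producing only the lines of the
    alternating field parity to reproduce the interlaced image signal."""
    fields = []
    for n, payload in enumerate(code_train):
        frame_720 = decode_frame(payload, lines_720)     # progressive image decoder 22
        parity = n % 2                                   # even/odd output field
        rows = range(parity, lines_1080, 2)
        field = np.empty((len(rows),) + frame_720.shape[1:], dtype=np.float64)
        for k, r in enumerate(rows):
            pos = r * (lines_720 - 1) / (lines_1080 - 1)  # position in the 720-line grid
            i0 = int(pos)
            i1 = min(i0 + 1, lines_720 - 1)
            w = pos - i0
            field[k] = (1.0 - w) * frame_720[i0] + w * frame_720[i1]
        fields.append(field)
    return fields

def decode_frame(payload, lines):
    # Placeholder for the progressive image decoder 22; it pairs with the
    # stand-in coder sketched earlier (one byte per sample, fixed width assumed).
    width = len(payload) // lines
    return np.frombuffer(payload, dtype=np.uint8).reshape(lines, width).astype(np.float64)
```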
Fundamentals of Fluid Mechanics (in English)
Fundamentals of Ship Science MSc Course SESS6001, University of Southampton

Inviscid Fluid
An inviscid fluid, or ideal fluid, is a simplified conceptual idealisation of a real fluid. An inviscid fluid cannot support shear. Hence, in the vicinity of solid boundaries, the 'no-slip' boundary condition of a real fluid is not realisable. This implies that continuity of fluid velocity across such boundaries is limited to the normal velocity component, as 'slippage' is possible in the tangential direction. The equations of fluid motion (Lagrangian and Eulerian) are then dependent only upon the pressure gradient and body or gravitational influences, and not upon the shear influences that arise from viscosity in a real fluid. These alternative forms of the equations of motion are derived from first principles in a later section.

Incompressible Fluid
An incompressible fluid is one that neither gains nor loses mass in a selected volume V bounded by the surface S. Schematically, we can think of this situation as follows: the number of particles flowing into V corresponds to the number of particles flowing out of V, subject to no mass change within V. Here ρ v · n dS is the normal mass flux through the elemental surface dS, where n is the unit outward normal.
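Written out, the statement above is the usual conservation-of-mass condition (a standard identity, stated here for completeness; bold symbols denote vectors):

\[
\oint_S \rho\,\mathbf{v}\cdot\mathbf{n}\,\mathrm{d}S = 0
\quad\Longrightarrow\quad
\int_V \nabla\cdot(\rho\,\mathbf{v})\,\mathrm{d}V = 0 ,
\]

and, since the volume V is arbitrary and ρ is constant for an incompressible fluid,

\[
\nabla\cdot\mathbf{v} = 0 .
\]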
Image-based Fac ¸ade ModelingJianxiong XiaoTian FangPing Tan ∗Peng ZhaoEyal Ofek †Long QuanThe Hong Kong University of Science and Technology ∗National University of Singapore †MicrosoftFigure 1:A few fac¸ade modeling examples from the two sides of a street with 614captured images:some input images in the bottom row,the recovered model rendered in the middle row,and three zoomed sections of the recovered model rendered in the top row.AbstractWe propose in this paper a semi-automatic image-based approach to fac ¸ade modeling that uses images captured along streets and re-lies on structure from motion to recover camera positions and point clouds automatically as the initial stage for modeling.We start by considering a building fac ¸ade as a flat rectangular plane or a developable surface with an associated texture image composited from the multiple visible images.A fac ¸ade is then decomposed and structured into a Directed Acyclic Graph of rectilinear elementary patches.The decomposition is carried out top-down by a recursive subdivision,and followed by a bottom-up merging with the detec-tion of the architectural bilateral symmetry and repetitive patterns.Each subdivided patch of the flat fac ¸ade is augmented with a depth optimized using the 3D points cloud.Our system also allows for an easy user feedback in the 2D image space for the proposed decom-position and augmentation.Finally,our approach is demonstrated on a large number of fac ¸ades from a variety of street-side images.CR Categories:I.3.5[Computer Graphics]:Computational ge-ometry and object modeling—Modeling packages;I.4.5[ImageProcessing and computer vision]:Reconstruction.Keywords:Image-based modeling,building modeling,fac ¸ade modeling,city modeling,photography.1IntroductionThere is a strong demand for the photo-realistic modeling of cities for games,movies and map services such as in Google Earth and Microsoft Virtual Earth.However,most work has been done on large-scale aerial photography-based city modeling.When we zoom to ground level,the viewing experience is often disappoint-ing,with blurry models with few details.On the other hand,many potential applications require street-level representation of cities,where most of our daily activities take place.In term of spatial con-straints,the coverage of ground-level images is close-range.More data need to be captured and processed.This makes street-side modeling much more technically challenging.The current state of the art ranges from pure synthetic methods such as artificial synthesis of buildings based on grammar rules [M¨u ller et al.2006],3D scanning of street fac ¸ades [Fr¨u h and Zakhor 2003],to image-based approaches [Debevec et al.1996].M¨u ller et al.[2007]required manual assignment of depths to the fac ¸ade as they have only one image.However,we do have information from the reconstructed 3D points to automatically infer the critical depth of each primitive.Fr¨u h and Zakhor [2003]required tedious 3D scan-ning,while Debevec et al.[1996]proposed the method for a small set of images that cannot be scaled up well for large scale modelingACM Transaction on Graphics (TOG)Proceedings of SIGGRAPH Asia 2008Figure 2:Overview of the semi-automatic approach to image-based fac ¸ade modeling.of buildings.We propose a semi-automatic method to reconstruct 3D fac ¸ademodels of high visual quality from multiple ground-level street-view images.The key innovation of our approach is the intro-duction of a systematic and automatic decomposition scheme of fac ¸ades for both analysis and 
reconstruction.The decomposition is achieved through a recursive subdivision that preserves the archi-tectural structure to obtain a Directed Acyclic Graph representation of the fac ¸de by both top-down subdivision and bottom-up merging with local bilateral symmetries to handle repetitive patterns.This representation naturally encodes the architectural shape prior of a fac ¸ade and enables the depth of the fac ¸ade to be optimally com-puted on the surface and at the level of the subdivided regions.We also introduce a simple and intuitive user interface that assists the user to provide feedback on fac ¸ade partition.2Related workThere is a large amount of literature on fac ¸ade,building and archi-tectural modeling.We classify these studies as rule-based,image-based and vision-based modeling approaches.Rule-based methods.The procedural modeling of buildings specifies a set of rules along the lines of L-system.The methods in [Wonka et al.2003;M¨u ller et al.2006]are typical examples of procedural modeling.In general,procedural modeling needs expert specifications of the rules and may be limited in the realism of re-sulting models and their variations.Furthermore,it is very difficult to define the needed rules to generate exact existing buildings.Image-based methods.Image-based methods use images asguide to generate models of architectures interactively.Fac ¸ade de-veloped by Debevec et al.[1996]is a seminal work in this cate-gory.However,the required manual selection of features and the correspondence in different views is tedious,and cannot be scaled up well.M¨u ller et al.[2007]used the limited domain of regular fac ¸ades to highlight the importance of the windows in an architec-tural setting with one single image to create an impressive result of a building fac ¸ade while depth is manually assigned.Although,this technique is good for modeling regular buildings,it is limited to simple repetitive fac ¸ades and cannot be applicable to street-view data as in Figure 1.Oh et al.[2001]presented an interactive sys-tem to create models from a single image.They also manually as-signed the depth based on a painting metaphor.van den Hengel et al.[2007]used a sketching approach in one (or more)image.Although this method is quite general,it is also difficult to scale up for large-scale reconstruction due to the heavy manual interac-tion.There are also a few manual modeling solutions on the market,such as Adobe Canoma,RealViz ImageModeler,Eos Systems Pho-toModeler and The Pixel Farm PFTrack,which all require tedious manual model parameterizations and point correspondences.Vision-based methods.Vision-based methods automatically re-construct urban scenes from images.The typical examples are thework in [Snavely et al.2006;Goesele et al.2007],[Cornelis et al.2008]and the dedicated urban modeling work pursued by Univer-sity of North Carolina at Chapel Hill and University of Kentucky (UNC/UK)[Pollefeys et al.2007]that resulted in meshes on dense stereo reconstruction.Proper modeling with man-made structural constraints from reconstructed point clouds and stereo data has not yet been addressed.Werner and Zisserman [2002]used line seg-ments to reconstruct buildings.Dick et al.[2004]developed 3D architectural modeling from short image sequences.The approach is Bayesian and model based,but it relies on many specific archi-tectural rules and model parameters.Lukas et al.[2006;2008]developed a complete system of urban scene modeling based on aerial images.The result looks good from the top view,but not from the 
ground level.Our approach is therefore complementary to their system such that the street level details are added.Fr¨u h and Zakhor [2003]also used a combination of aerial imagery,ground color and LIDAR scans to construct models of fac ¸ades.However,like stereo methods,it suffers from the lack of representation for the styles in man-made architectures.Agarwala et al.[2006]composed panoramas of roughly planar scenes without producing 3D models.3OverviewOur approach is schematized in Figure 2.SFM From the captured sequence of overlapping images,we first automatically compute the structure from motion to obtain a set of semi-dense 3D points and all camera positions.We then register the reconstruction with an existing approximate model of the buildings (often recovered from the real images)using GPS data if provided or manually if geo-registration information is not available.Fac ¸ade initialization We start a building fac ¸ade as a flat rectangular plane or a developable surface that is obtained either automatically from the geo-registered approximate building model or we manu-ally mark up a line segment or a curve on the projected 3D points onto the ground plane.The texture image of the flat fac ¸ade is com-puted from the multiple visible images of the fac ¸ade.The detection of occluding objects in the texture composition is possible thanks to the multiple images with parallaxes.Fac ¸ade decomposition A fac ¸ade is then systematically decom-posed into a partition of rectangular patches based on the horizontal and vertical lines detected in the texture image.The decomposition is carried out top-down by a recursive subdivision and followed by a bottom-up merging,with detection of the architectural bilateral symmetry and repetitive patterns.The partition is finally structured into a Directed Acyclic Graph of rectilinear elementary patches.We also allow the user to edit the partition by simply adding and removing horizontal and vertical lines.Fac¸ade augmentation Each subdivided patch of theflat fac¸ade is augmented with the depth obtained from the MAP estimation of the Markov Random Field with the data cost defined by the3D points from the structure from motion.Fac¸ade completion Thefinal fac¸ade geometry is automatically re-textured from all input images.Our main technical contribution is the introduction of a systematic decomposition schema of the fac¸ade that is structured into a Direct Acyclic Graph and implemented as a top-down recursive subdivi-sion and bottom-up merging.This representation strongly embeds the architectural prior of the fac¸ades and buildings into different stages of modeling.The proposed optimization for fac¸ade depth is also unique in that it operates in the fac¸ade surface and in the super-pixel level of a whole subdivision region.4Image CollectionImage capturing We use a camera that usually faces orthogonal to the building fac¸ade and moves laterally along the streets.The camera should preferably be held straight and the neighboring two views should have sufficient overlapping to make the feature corre-spondences computable.The density and the accuracy of the recon-structed points vary,depending on the distance between the camera and the objects,and the distance between the neighboring viewing positions.Structure from motion Wefirst compute point correspondences and structure from motion for a given sequence of images.There are standard computer vision techniques for structure from mo-tion[Hartley and Zisserman2004].We use the approach described in[Lhuillier and 
Quan2005]to compute the camera poses and a semi-dense set of3D point clouds in space.This technique is used because it has been shown to be robust and capable of providingsufficient point clouds for object modelingpurposes.(a)(b)(c)Figure3:A simple fac¸ade can be initialized from aflat rectangle (a),a cylindrical portion(b)or a developable surface(c).5Fac¸ade InitializationIn this paper,we consider that a fac¸ade has a dominant planar struc-ture.Therefore,a fac¸ade is aflat plane with a depthfield on the plane.We also expect and assume that the depth variation within a simple fac¸ade is moderate.A real building fac¸ade having complex geometry and topology could therefore be broken down into mul-tiple simple fac¸ades.A building is merely a collection of fac¸ades, and a street is a collection of buildings.The dominant plane of the majority of the fac¸ades isflat,but it can be curved sometimes as well.We also consider the dominant surface structure to be any cylinder portion or any developable surface that can be swept by a straight line as illustrated in Figure3.To ease the description,but without loss of generality,we use aflat fac¸ade in the remainder of the paper.For the developable surface,the same methods as forflat fac¸ades in all steps are used,with trivial surface parameterizations. Some cylindrical fac¸ade examples are given in the experiments.Algorithm1Photo Consistency Check For Occlusion Detection Require:A set of N image patches P={p1,p2,...,p N}cor-responding to the projections{x i}of the3D point X. Require:η∈[0,1]to indicate when two patches are similar.1:for all p i∈P do2:s i←0⊲Accumulated similarity for p i 3:for all p j∈P do4:s ij←NCC(p i,p j)5:if s ij>ηthen s i←s i+s ij6:end if7:end for8:end for9: n←arg max i s i⊲ n is the patch with best support 10:V←∅⊲V is the index set with visible projection 11:O←∅⊲V is the index set with occluded projection 12:for all p i∈P do13:if s i n>ηthen V←V∪{i}14:else O←O∪{i}15:end if16:end for17:return V and O5.1Initial Flat RectangleThe reference system of the3D reconstruction can be geo-registered using GPS data of the camera if available or using an interactive technique.Illustrated in Figure2,the fac¸ade modeling process can begin with an existing approximate model of the build-ings often reconstructed from areal images,such as publicly avail-able from Google Earth and Microsoft Virtual Earth.Alternatively, if no such approximate model exists,a simple manual process in the current implementation is used to segment the fac¸ades,based on the projections of the3D points to the groundfloor.We draw a line segment or a curve on the ground to mark up a fac¸ade plane as aflat rectangle or a developable surface portion.The plane or surface position is automaticallyfitted to the3D points or manually adjusted if necessary.5.2Texture CompositionThe geometry of the fac¸ade is initialized as aflat ually, a fac¸ade is too big to be entirely observable in one input image.We first compose a texture image for the entire rectangle of the fac¸ade from the input images.This process is different from image mo-saic,as the images have parallax,which is helpful for removing the undesired occluding objects such as pedestrians,cars,trees,tele-graph poles and trash cans,that lies in front of the target fac¸ade. Furthermore,the fac¸ade plane position is known,compared with an unknown spatial position in stereo algorithms.Hence,the photo consistency constraint is more efficient and robust for occluding object removal,with a better texture image than a pure mosaic. 
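Algorithm 1 above maps almost line-for-line onto a short routine. The following is a minimal Python/NumPy sketch of it, assuming the patches have already been extracted as equal-sized grayscale arrays around each projection x_i of the 3D point; the NCC helper and the default value of eta are illustrative choices, not values taken from the text.

```python
import numpy as np

def ncc(p, q, eps=1e-8):
    """Normalized cross-correlation between two equal-sized grayscale patches."""
    p = (p - p.mean()) / (p.std() + eps)
    q = (q - q.mean()) / (q.std() + eps)
    return float((p * q).mean())

def photo_consistency_check(patches, eta=0.6):
    """Algorithm 1: split the projections of one 3D point into visible (V) and
    occluded (O) views according to their mutual NCC support.

    patches : list of equal-sized 2D numpy arrays, one per camera seeing the point
    eta     : similarity threshold in [0, 1]; 0.6 is an illustrative default
    """
    n = len(patches)
    sim = np.zeros((n, n))
    support = np.zeros(n)                 # accumulated similarity s_i per patch
    for i in range(n):
        for j in range(n):
            sim[i, j] = ncc(patches[i], patches[j])
            if sim[i, j] > eta:
                support[i] += sim[i, j]
    best = int(np.argmax(support))        # n_hat, the patch with best support
    V = [i for i in range(n) if sim[i, best] > eta]    # visible projections
    O = [i for i in range(n) if sim[i, best] <= eta]   # occluded projections
    return V, O
```

The returned index sets V and O are then used, pixel by pixel on the reference texture, to decide which source images are allowed to contribute to the composited façade texture.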
Multi-view occlusion removal As in many multiple view stereo methods,photo consistency is defined as follows.Consider a 3D point X=(x,y,z,1)′with color c.If it has a projection, x i=(u i,v i,1)′=P i X in the i-th camera P i,under the Lam-bertian surface assumption,the projection x i should also have the same color,c.However,if the point is occluded by some other ob-jects in this camera,the color of the projection is usually not the same as c.Note that c is unknown.Assuming that point X is visible from multiple cameras,I={P i},and occluded by some objects in the other cameras,I′={P j},then the color,c i,of the projections in I should be the same as c,while it may be differ-ent from the color,c j,of projections in I′.Now,given a set of projection colors,{c k},the task is to identify a set,O,of the oc-(a)Indicate(b)Remove(c)Inpaint(d)Guide(e)ResultFigure4:Interactive texture refinement:(a)drawn strokes on theobject to indicate removal.(b)the object is removed.(c)automati-cally inpainting.(d)some green lines drawn to guide the structure.(e)better result achieved with the guide lines.cluded cameras.In most situations,we can assume that point X isvisible from most of the cameras.Under this assumption,we have c≈median k{c k}.Given the estimated color of the3D point c,it is now very easy to identify the occluded set,O,according to theirdistances with c.To improve the robustness,instead of a singlecolor,the image patches centered at the projections are used,andpatch similarity,normalized cross correlation(NCC),is used as ametric.The details are presented in Algorithm1.In this way,withthe assumption that the fac¸ade is almost planar,each pixel of thereference texture corresponds to a point that lies on theflat fac¸ade.Hence,for each pixel,we can identify whether it is occluded in aparticular camera.Now,for a given planar fac¸ade in space,all vis-ible images arefirst sorted according to the fronto-parallelism ofthe images with respect to the given fac¸ade.An image is said tobe more fronto-parallel if the projected surface of the fac¸ade in theimage is larger.The reference image isfirst warped from the mostfronto-parallel image,then from the lesser ones according to thevisibility of the point.Inpainting In each step,due to existence of occluding objects,some regions of the reference texture image may still be left empty.In a later step,if an empty region is not occluded and visible fromthe new camera,the region isfilled.In this way of a multi-viewinpainting,the occluded region isfilled from each single camera.At the end of the process,if some regions are still empty,a nor-mal image inpainting technique is used tofill it either automatically[Criminisi et al.2003]or interactively as described in Section5.3.Since we have adjusted the cameras according to the image corre-spondences during bundle adjustment of structure from motion,thissimple mosaic without explicit blending can already produce veryvisually pleasing results.5.3Interactive RefinementAs shown in Figure4,if the automatic texture composition result isnot satisfactory,a two-step interactive user interface is provided forrefinement.In thefirst step,the user can draw strokes to indicatewhich object or part of the texture is undesirable as in Figure4(a).The corresponding region is automatically extracted based on theinput strokes as in Figure4(b)using the method in[Li et al.2004].The removal operation can be interpreted as that the most fronto-parallel and photo-consistent texture selection,from the result ofAlgorithm1,is not what the user 
wants.For each pixel, n fromLine9of Algorithm1and V should be wrong.Hence,P is up-dated to exclude V:P←O.Then,if P=∅,Algorithm1isrun again.Otherwise,image inpainting[Criminisi et al.2003]isused for automatically inpainting as in Figure4(c).In the secondstep,if the automatic texturefilling is poor,the user can manuallyspecify important missing structural information by extending a fewcurves or line segments from the known to the unknown regions asin Figure4(d).Then,as in[Sun et al.2005],image patches are syn-thesized along these user-specified curves in the unknown regionusing patches selected around the curves in the known region byLoopy Belief Propagation tofind the optimal patches.After com-pleting the structural propagation,the remaining unknownregions(a)Input(b)Structure(c)WeightACB DEHF G(d)SubdivideM(e)Merge Figure5:Structure preserving subdivision.The hidden structure of the fac¸ade is extracted out to form a grid in(b).Such hypotheses are evaluated according to the edge support in(c),and the fac¸ade is recursively subdivided into several regions in(d).Since there is not enough support between Regions A,B,C,D,E,F,G,H,they are all merged into one single region M in(e).arefilled using patch-based texture synthesis as in Figure4(e).6Fac¸ade DecompositionBy decomposing a fac¸ade we try to best describe the faced struc-ture,by segmenting it to a minimal number of elements.The fac¸ades that we are considering inherit the natural horizontal and vertical directions by construction.In thefirst approximation,we may take all visible horizontal and vertical lines to construct an ir-regular partition of the fac¸ade plane into rectangles of various sizes. This partition captures the global rectilinear structure of the fac¸ades and buildings and also keeps all discontinuities of the fac¸ade sub-structures.This usually gives an over-segmentation of the image into patches.But this over-segmentation has several advantages. The over-segmenting lines can also be regarded as auxiliary lines that regularize the compositional units of the fac¸ades and buildings. Some’hidden’rectilinear structures of the fac¸ade during the con-struction can also be rediscovered by this over-segmentation pro-cess.6.1Hidden Structure DiscoveryTo discover the structure inside the fac¸ade,the edge of the reference texture image isfirst detected[Canny1986].With such edge maps, Hough transform[Duda and Hart1972]is used to recover the lines. 
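A compact sketch of this hidden-structure step is given below, using OpenCV's Canny and Hough implementations. It already includes the horizontal/vertical restriction and the per-segment edge-support weight w_e described next; all threshold values are illustrative assumptions, since the paper does not specify them, and the input is assumed to be an 8-bit grayscale texture.

```python
import cv2
import numpy as np

def hidden_structure(texture_gray, canny_lo=50, canny_hi=150,
                     hough_thresh=80, angle_tol=np.deg2rad(2.0)):
    """Detect the facade's horizontal/vertical line grid from an 8-bit grayscale
    texture and weight every grid segment by its Canny edge support (w_e)."""
    edges = cv2.Canny(texture_gray, canny_lo, canny_hi)
    h, w = edges.shape

    # Standard Hough transform, keeping only near-horizontal / near-vertical lines.
    xs, ys = set(), set()
    lines = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=hough_thresh)
    if lines is not None:
        for rho, theta in lines[:, 0]:
            if min(abs(theta), abs(theta - np.pi)) < angle_tol:      # vertical line
                xs.add(int(np.clip(round(abs(rho)), 0, w - 1)))
            elif abs(theta - np.pi / 2) < angle_tol:                 # horizontal line
                ys.add(int(np.clip(round(rho), 0, h - 1)))
    xs = sorted(xs | {0, w - 1})
    ys = sorted(ys | {0, h - 1})

    # Each short segment between consecutive grid intersections gets a weight
    # equal to the number of Canny edge pixels it covers.
    def edge_count(x0, y0, x1, y1):
        return int(edges[y0:y1 + 1, x0:x1 + 1].sum() // 255)

    weights = {}
    for y in ys:
        for xa, xb in zip(xs[:-1], xs[1:]):
            weights[((xa, y), (xb, y))] = edge_count(xa, y, xb, y)   # horizontal segment
    for x in xs:
        for ya, yb in zip(ys[:-1], ys[1:]):
            weights[((x, ya), (x, yb))] = edge_count(x, ya, x, yb)   # vertical segment
    return xs, ys, weights
```

The xs/ys coordinates define the over-segmenting grid, and the weights dictionary plays the role of w_e when subdivision and merging hypotheses are evaluated.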
To improve the robustness,the direction of the Hough transform is constrained to only horizontal and vertical,which happens in most architectural fac¸ades.The detected lines now form a grid to parti-tion the whole reference image,and this grid contains many non-overlapping short line segments by taking intersections of Hough lines as endpoints as in Figure5(b).These line segments are now the hypothesis to partition the fac¸ade.The Hough transformation is good for structure discovery since it can extract the hidden global information from the fac¸ade and align line segments to this hidden structure.However,some line segments in the formed grid may not really be a partition boundary between different regions.Hence,the weight,w e,is defined for each line segment,e,to indicate the like-lihood that this line segment is a boundary of two different regions as shown in Figure5(c).This weight is computed as the number of edge points from the Canny edge map covered by the line segment.Remark on over-segmented partition It is true that the current partition schema is subject to segmentation parameters.But it is important to note that usually a slightly over-segmented partition is not harmful for the purpose of modeling.A perfect partition cer-tainly eases the regularization of the fac¸ade augmentation by depth as presented in the next section.Nevertheless,an imperfect,partic-ularly a slight over-segmented partition,does not affect the model-ing results when the3D points are dense and the optimization works well.(a)Edge weight support(b)Regional statistics supportFigure 6:Merging support evaluation.6.2Recursive SubdivisionGiven a region,D ,in the texture image,it is divided into two sub rectangular regions,D 1and D 2,such that D =D 1∪D 2,by a line segment L with strongest support from the edge points.After D is subdivided into two separate regions,the subdivision procedures continue on the two regions,D 1and D 2,recursively.The recursive subdivision procedure is stopped if either the target region,D ,is too small to be subdivided,or there is not enough support for a division hypothesis,i.e.,region D is very smooth.For a fac ¸ade,the bilateral symmetry about a vertical axis may not exist for the whole fac ¸ade,but it exists locally and can be used for more robust subdivision.First,for each region,D ,the NCC score,s D ,of the two halves,D 1and D 2,vertically divided at the center of D is computed.If s D >η,region D is considered to have bilateral symmetry.Then,the edge map of D 1and D 2are averaged,and subdivision is recursively done on D 1only.Finally,the subdivision in D 1is reflected across the axis to become the subdivision of D 2,and merged the two subdivisions into the subdivision of D .Recursive subdivision is good to preserve boundaries for man-made structural styles.However,it may produce some unnecessary fragments for depth computation and rendering as in Figure 5(d).Hence,as a post-processing,if two neighboring leaf subdivision re-gions,A and B ,has not enough support,s AB ,to separate them,they are merged into one region.The support,s AB ,to separate two neighbor regions,A and B ,is defined to be the strongest weight of all the line segments on the border between A and B :s AB =max e {w e }.However,the weights of line segments can only offer a local image statistic on the border.To improve the ro-bustness,a dual information region statistic between A and B can be used more globally.As in Figure 6,Since regions A and B may not have the same size,this region statistic similarity is defined as 
follows:First,an axis is defined on the border between A and B ,and region B is mirrored on this axis to have a region,−→B .The over-lapped region,A ∩−→B between A and −→B is defined to be the pixelsfrom A with locations inside −→B .In a similar way,←−A ∩B containsthe pixels from B with locations inside ←−A ,and then it is mirrored tobecome −−−−→←−A ∩B according the the same axis.The normalized crosscorrelation (NCC)between A ∩−→B and −−−−→←−A ∩B is used to define the regional similarity of A and B .In this way,only the symmetric part of A and B is used for region comparison.Therefore,the effect of the other far-away parts of the region is avoided,which will happen if the size of A and B is dramatically different and global statistics,such as the color histogram,are used.Weighted by a parameter,κ,the support,s AB ,to separate two neighboring regions,A and B ,is now defined ass AB =max e{w e }−κNCC (A ∩−→B ,−−−−→←−A ∩B ).Note that the representation of the fac ¸ade is a binary recursive tree before merging and a Directed Acyclic Graph (DAG)after region merging.The DAG representation can innately support the Level of Detail rendering technique.When great details are demanded,the rendering engine can go down the rendering graph to expand all detailed leaves and render them correspondingly.Vice versa,the(x 1,y 1)(x 4,y 4)(x 2,y 2)(x 3,y 3)(a)Fac ¸ade(b)DAGFigure 7:A DAG for repetitive pattern representation.intermediate node is rendered and all its descendents are pruned atrendering time.6.3Repetitive Pattern RepresentationThe repetitive patterns of a fac ¸ade locally exist in many fac ¸ades and most of them are windows.[M¨u ller et al.2007]used a compli-cated technique for synchronization of subdivisions between differ-ent windows.To save storage space and to ease the synchroniza-tion task,in our method,only one subdivision representation for the same type of windows is maintained.Precisely,a window tem-plate is first detected by a trained model [Berg et al.2007]or man-ually indicated on the texture images.The templates are matched across the reference texture image using NCC as the measurement.If good matches exist,they are aligned to the horizontal or vertical direction by a hierarchical clustering,and the Canny edge maps on these regions are averaged.During the subdivision,each matched region is isolated by shrinking a bounding rectangle on the average edge maps until it is snapped to strong edges,and it is regarded as a whole leaf region.The edges inside these isolated regions should not affect the global structure,and hence these edge points are not used during the global subdivision procedure.Then,as in Figure 7,all the matched leaf regions are linked to the root of a common subdivision DAG for that type of window,by introducing 2D trans-lation nodes for the pivot position.Recursive subdivision is again executed on the average edge maps of all matched regions.To pre-serve photo realism,the textures in these regions are not shared and only the subdivision DAG and their respective depths are shared.Furthermore,to improve the robustness of the subdivision,the ver-tical bilateral symmetric is taken as a hard constraint for windows.6.4Interactive Subdivision RefinementIn most situations,the automatic subdivision works satisfactorily.If the user wants to refine the subdivision layout further,three line op-erations and two region operations are provided.The current auto-matic subdivision operates on the horizontal and vertical directions for robustness and simplicity.The fifth ‘carve’operator 
allows the user to sketch arbitrarily shaped objects manually, which appear less frequently, to be included in the façade representation.
Add: In an existing region, the user can sketch a stroke to indicate the partition, as in Figure 8(a). The edge points near the stroke are forced to become salient, and hence the subdivision engine can recover the line segment and partition the region.
Delete: The user can sketch a zigzag stroke to cross out a line segment, as in Figure 8(b).
Change: The user can first delete the partition line segments and then add a new line segment. Alternatively, the user can directly sketch a stroke; the line segment crossed by the stroke will then be deleted and a new line segment will be constructed accordingly, as in Figure 8(c). After the operation, all descendants with the target
2D-LDA:A statistical linear discriminant analysisfor image matrixMing Li *,Baozong YuanInstitute of Information Science,Beijing Jiaotong University,Beijing 100044,ChinaReceived 19August 2004Available online 18October 2004AbstractThis paper proposes an innovative algorithm named 2D-LDA,which directly extracts the proper features from image matrices based on Fisher Õs Linear Discriminant Analysis.We experimentally compare 2D-LDA to other feature extraction methods,such as 2D-PCA,Eigenfaces and Fisherfaces.And 2D-LDA achieves the best performance.Ó2004Elsevier B.V.All rights reserved.Keywords:Feature extraction;Image representation;Linear discriminant analysis;Subspace techniques;Face recognition1.IntroductionFeature extraction is the key to face recogni-tion,as it is to any pattern classification task.The aim of feature extraction is reducing the dimensionality of face image so that the extracted features are as representative as possible.The class of image analysis methods called appearance-based approach has been of wide concern,which relies on statistical analysis and machine learning.Turk and Pentland (1991)presented the well-known Eigenfaces method for face recognition,which uses principal component analysis (PCA)for dimensionality reduction.However,the base physical similarity of the represented images to originals does not provide the best measure of use-ful information for distinguishing faces from one another (O ÕToole,1993).Belhumeur et al.(1997)proposed Fisherfaces method,which is based on Fisher Õs Linear Discriminant and produces well separated classes in a low-dimensional subspace.His method is insensitive to large variation in lighting direction and facial expression.Recently,Yang (2002)investigated the Kernel PCA for learning low dimensional representations for face recognition and found that the Kernel methods provide better representations and achieve lower error rates for face recognition.Bartlett et al.(2002)proposed using ICA for face representation,which is sensitive to the high-order0167-8655/$-see front matter Ó2004Elsevier B.V.All rights reserved.doi:10.1016/j.patrec.2004.09.007*Corresponding author.Tel.:+861051683149;fax:+516861688616.E-mail address:liming@ (M.Li).Pattern Recognition Letters 26(2005)527–532statistics.This method is superior to representa-tions based on PCA for recognizing faces across days and changes in expression.However,Kernel PCA and ICA are both computationally more expensive than PCA.Weng et al.(2003)presented a fast method,called candid covariance-free IPCA (CCIPCA),to obtain the principal components of high-dimensional image vectors.Moghaddan (2002)compared the Bayesian subspace method with several PCA-related methods(PCA,ICA, and Kernel PCA).The experimental results dem-onstrated its superiority over PCA,ICA and Ker-nel PCA.All the PCA-related methods discussed above are based on the analysis of vectors.When dealing with images,we shouldfirstly transform the image matrixes into image vectors.Then based on these vectors the covariance matrix is calculated and the optimal projection is obtained.However,face images are high-dimensional patterns.For exam-ple,an image of112·92will form a10304-dimen-sional vector.It is difficult to evaluate the covariance matrix in such a high-dimensional vector space.To overcome the drawback,Yang proposed a straightforward image projection tech-nique,named as image principal component analy-sis(IMPCA)(Yang et al.,2004),which is directly based on analysis of original image matrices.Dif-ferent to traditional 
PCA,2D-PCA is based on 2D matrices rather than1D vectors.This means that the image matrix does not need to be con-verted into a vector.As a result,2D-PCA has two advantages:easier to evaluate the covariance matrix accurately and lower time-consuming.Liu et al.(1993)proposed an iterative method to calcu-late the Foley-Sammon optimal discriminant vec-tors from image matrixes.And he proposed to substitute D t=D b+D w for D w to overcome the sin-gularity problem.LiuÕs method was complicate and didnÕt resolve the singularity problem well.In this paper,a statistical linear discriminant analysis for image matrix is discussed.Our method proposes to use Fisher linear projection criterion tofind out a good projection.This crite-rion is based on two parameters:the between-class scatter matrix and the within-class scatter matrix. Because the dimension of between-class and within-class scatter matrix is much low(compara-tive to number of training samples).So,the prob-lem,that the within-class scatter matrix maybe singular,will be handled.At the same time,the compute-costing is lower than traditional Fisher-faces.Moreover,we discuss about image recon-struction and conduct a series of experiments on the ORL face database.The organization of this paper is as follows:In Section2,we propose the idea and describe the algorithm in detail.In Section3,we compare 2D-LDA with Eigenfaces,Fisherfaces and2D-PCA on the ORL face database.Finally,the paper concludes with some discussions in Section4.2.Two-dimensional linear discriminant analysis 2.1.Principle:The construction of Fisher projection axisLet A denotes a m·n image,and x denotes an n-dimensional column vector.A is projected onto x by the following linear transformationy¼Ax:ð1ÞThus,we get an m-dimensional projected vector y,which is called the feature vector of the image A.Suppose there are L known pattern classes in the training set,and M denotes the size of the training set.The j th training image is denoted by an m·n matrix A j(j=1,2,...,M),and the mean image of all training sample is denoted by A and A iði¼1;2;...;LÞdenoted the mean image of class T i and N i is the number of samples in class T i,the projected class is P i.After the projection of training image onto x,we get the projected fea-ture vectoryj¼A j x;j¼1;2;...;M:ð2ÞHow do we judge a projection vector x is good? 
In fact, the total scatter of the projected samples can be characterized by the trace of the covariance matrix of the projected feature vectors (Turk and Pentland, 1991). From this point of view, we introduced a criterion at first,

J(x) = \frac{P_B}{P_W}.  (3)

There are two parameters,

P_B = \mathrm{tr}(TS_B),  (4)
P_W = \mathrm{tr}(TS_W),  (5)

where TS_B denotes the between-class scatter matrix of the projected feature vectors of the training images, and TS_W denotes the within-class scatter matrix of the projected feature vectors of the training images. So,

TS_B = \sum_{i=1}^{L} N_i (\bar{y}_i - \bar{y})(\bar{y}_i - \bar{y})^T = \sum_{i=1}^{L} N_i [(\bar{A}_i - \bar{A})x][(\bar{A}_i - \bar{A})x]^T,  (6)

TS_W = \sum_{i=1}^{L} \sum_{y_k \in P_i} (y_k - \bar{y}_i)(y_k - \bar{y}_i)^T = \sum_{i=1}^{L} \sum_{y_k \in P_i} [(A_k - \bar{A}_i)x][(A_k - \bar{A}_i)x]^T.  (7)

So

\mathrm{tr}(TS_B) = x^T \Big( \sum_{i=1}^{L} N_i (\bar{A}_i - \bar{A})^T (\bar{A}_i - \bar{A}) \Big) x = x^T S_B x,  (8)

\mathrm{tr}(TS_W) = x^T \Big( \sum_{i=1}^{L} \sum_{A_k \in T_i} (A_k - \bar{A}_i)^T (A_k - \bar{A}_i) \Big) x = x^T S_W x.  (9)

We can evaluate TS_B and TS_W directly using the training image samples. So the criterion can be expressed by

J(x) = \frac{x^T S_B x}{x^T S_W x},  (10)

where x is a unitary column vector. This criterion is called the Fisher linear projection criterion. The unitary vector x that maximizes J(x) is called the Fisher optimal projection axis. The optimal projection x_{opt} is chosen when the criterion is maximized, i.e.,

x_{opt} = \arg\max_x J(x).  (11)

If S_W is nonsingular, the solution to the above optimization problem is to solve the generalized eigenvalue problem (Turk and Pentland, 1991):

S_B x_{opt} = \lambda S_W x_{opt}.  (12)

In the above equation, \lambda is the maximal eigenvalue of S_W^{-1} S_B.

The traditional LDA must face the singularity problem. However, 2D-LDA overcomes this problem successfully. This is because, for each training image A_j (j = 1, 2, ..., M), we have rank(A_j) = min(m, n). From (9), we have

\mathrm{rank}(S_W) = \mathrm{rank}\Big( \sum_{i=1}^{L} \sum_{A_k \in T_i} (A_k - \bar{A}_i)^T (A_k - \bar{A}_i) \Big) \le (M - L) \cdot \min(m, n).  (13)

So, in 2D-LDA, S_W is nonsingular when

M \ge L + \frac{n}{\min(m, n)}.  (14)

In real situations, (14) is always satisfied. So S_W is always nonsingular.

In general, it is not enough to have only one Fisher optimal projection axis. We usually need to select a set of projection axes, x_1, ..., x_d, subject to the orthonormal constraints. That is,

\{x_1, ..., x_d\} = \arg\max J(x),  subject to  x_i^T x_j = 0,  i \neq j,  i, j = 1, ..., d.  (15)

In fact, the optimal projection axes x_1, ..., x_d are the orthonormal eigenvectors of S_W^{-1} S_B corresponding to the first d largest eigenvalues. Using these projection axes, we can form a new Fisher projection matrix X, which is an n \times d matrix,

X = [x_1 \; x_2 \; \cdots \; x_d].  (16)

2.2. Feature extraction

We will use the optimal projection vectors of 2D-LDA, x_1, ..., x_d, for feature extraction. For a given image A, we have

y_k = A x_k,  k = 1, 2, ..., d.  (17)

Then we have a family of Fisher feature vectors y_1, ..., y_d, which form an m \times d matrix Y = [y_1, ..., y_d]. We call this matrix Y the Fisher feature matrix of the image A.

2.3. Reconstruction

In the 2D-LDA method, we can use the Fisher feature matrices and Fisher optimal projection axes to reconstruct an image by the following steps. For a given image A, the Fisher feature matrix is Y = [y_1, ..., y_d] and the Fisher optimal projection axes are X = [x_1, ..., x_d]; then we have

Y = AX.  (18)

Because x_1, ..., x_d are orthonormal, it is easy to obtain the reconstructed image of A:

\tilde{A} = Y X^T = \sum_{k=1}^{d} y_k x_k^T.  (19)

We call \tilde{A}_k = y_k x_k^T a reconstructed subimage of A, which has the same size as image A. This means that we use a set of 2D Fisherfaces to reconstruct the original image. If we select d = n, then we can completely reconstruct the images in the training set: \tilde{A} = A. If d < n, the reconstructed image \tilde{A} is an approximation of A.

2.4. Classification

Given two images A_1 and A_2 represented by their 2D-LDA feature matrices
Y_1 = [y_1^1, ..., y_d^1] and Y_2 = [y_1^2, ..., y_d^2], the similarity d(Y_1, Y_2) is defined as

d(Y_1, Y_2) = \sum_{k=1}^{d} \| y_k^1 - y_k^2 \|_2,  (20)

where \| y_k^1 - y_k^2 \|_2 denotes the Euclidean distance between the two Fisher feature vectors y_k^1 and y_k^2. If the Fisher feature matrices of the training images are Y_1, Y_2, ..., Y_M (M is the total number of training images), and each image is assigned to a class T_i, then for a given test image Y, if d(Y, Y_l) = min_j d(Y, Y_j) and Y_l \in T_i, the resulting decision is Y \in T_i.

3. Experiment and analysis

We evaluated our 2D-LDA algorithm on the ORL face image database. The ORL database contains images of 40 individuals, each person having 10 different images. For some individuals, the images were taken at different times. The facial expression (open or closed eyes, smiling or non-smiling) and facial details (glasses or no glasses) also vary. All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position (with tolerance for some side movement). The images were taken with a tolerance for some tilting and rotation of the face of up to 20°. Moreover, there is also some variation in scale of up to about 10%. The size of each image is 92 × 112 pixels, with 256 grey levels per pixel. Five samples of one person in the ORL database are shown in Fig. 1. The ORL database therefore lets us evaluate 2D-LDA's performance under conditions where pose and sample size are varied.

Using 2D-LDA, we can project a test face image onto the Fisher optimal projection axes and then use the set of Fisher feature vectors to reconstruct the image. Fig. 2 shows some reconstructed images and the original image of one person; there, the variable d denotes the number of dimensions used to map and reconstruct the face image. Observing these images, we find that the reconstructed images look much like the result of sampling the original image along evenly spaced vertical scanning lines.
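Since Sections 2.1-2.4 specify the whole pipeline, a compact implementation sketch may be helpful before the quantitative comparison. The code below is a minimal NumPy/SciPy rendering of Eqs. (6)-(20) as read from the text, not the authors' implementation; the choice of d = 15 simply mirrors the dimension used in the experiments, and the generalized eigenvectors returned by scipy.linalg.eigh are S_W-orthogonal rather than strictly orthonormal.

```python
import numpy as np
from scipy.linalg import eigh

def fit_2dlda(images, labels, d=15):
    """Estimate the n x d Fisher projection matrix X from m x n image matrices,
    following Eqs. (6)-(16) as read from the text."""
    images = np.asarray(images, dtype=float)          # shape (M, m, n)
    labels = np.asarray(labels)
    n = images.shape[2]
    global_mean = images.mean(axis=0)

    S_B = np.zeros((n, n))                            # between-class scatter, Eq. (8)
    S_W = np.zeros((n, n))                            # within-class scatter,  Eq. (9)
    for c in np.unique(labels):
        cls = images[labels == c]
        class_mean = cls.mean(axis=0)
        diff = class_mean - global_mean
        S_B += len(cls) * diff.T @ diff
        for A in cls:
            w = A - class_mean
            S_W += w.T @ w

    # Generalized eigenproblem S_B x = lambda S_W x; keep the d largest eigenvalues.
    eigvals, eigvecs = eigh(S_B, S_W)
    X = eigvecs[:, np.argsort(eigvals)[::-1][:d]]     # n x d Fisher projection matrix
    return X

def extract_features(A, X):
    """Fisher feature matrix Y = A X of Eq. (18): one m x d matrix per image."""
    return np.asarray(A, dtype=float) @ X

def classify(Y_test, train_features, train_labels):
    """Nearest-neighbour rule using the column-wise distance of Eq. (20)."""
    dists = [np.linalg.norm(Y_test - Y_tr, axis=0).sum() for Y_tr in train_features]
    return train_labels[int(np.argmin(dists))]
```

With the first five ORL images of each subject as training data, fit_2dlda followed by extract_features and classify follows the experimental protocol described below, up to implementation details the paper does not state.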
The reconstructed image e A is more and more like to the original image A as the value of d increased.We have done an experiment on the ORL data-base to evaluate performance of2D-LDA,2D-Fig.1.Five images in ORL database.530M.Li,B.Yuan/Pattern Recognition Letters26(2005)527–532PCA (Yang et al.,2004),Eigenfaces (Turk and Pentland,1991),Fisherfaces (Belhumeur et al.,1997).To evaluate the pure ability of these four in the fair environment,we did not do any prepro-cess on the face images,and we did not utilize any optimized algorithm.We just realized the algo-rithm appeared in the literature (Turk and Pent-land,1991;Belhumeur et al.,1997and Yang et al.,2004)without any modification.In our experiment,we select first five images samples per person for training,and the left five images samples for testing.So,in our experiment,the size of training set and testing set were both 200.So in the 2D-LDA,the size of between-class scatter ma-trix S B and within-class scatter matrix S W are both 92·92.Fig.3shows the classification result.From Fig.3,we find that the recognition rate of 2D-LDA have achieved the best performance in the four methods.And the best result of 2D-LDA 94.0%is much better than the best result of 2D-PCA 92.5%.And from Fig.3,we could find that the two 2D feature extraction methods have outstanding performance in the low-dimensioncondition,but the conventional ones Õability is very poor.Table 1showed out the comparison of the train-ing time of the four algorithms (CPU:Pentium IV 2.66GHz,RAM:256M).The four algorithms are realized in the Matlab environment.We could see that the 2D-LDA and 2D-PCA Õs computing-cost is very low compared with Eigenfaces and Fisher-faces.This is because in the 2D condition,we only need to handle a 92·92matrix.But using the Eigenfaces and Fisherfaces,we must face to a 10304·10304matrix.It is a hard work.At last,It must be mentioned that when used the Fisher-faces,we must reduced the dimension of image data to avoid that S W is singular (Belhumeur et al.,1997).Mapping the original data onto how many dimensions space is a hard problem.We must select the proper number of dimension through experiment.Considering this situation,Fisherfaces is very time-costing.Table 2showed that the memory cost of 2D fea-ture extraction is much larger than the 1D ones.This is because 2D methods used a n ·d matrix to present a face image.At the same time the 1D techniques reconstructed face images by a d -dimensionvector.Fig. parison of 2D-LDA and 2D-PCA on ORLDatabase.Fig.2.Some reconstructed images of one person.Table 2Comparison of memory cost (bytes)to present a 92·112image using different techniques (15dimensions)2D-LDA 2D-PCA Eigenfaces Fisherfaces 672067206060Table 1Comparison of CPU Time (s)for feature extraction using ORL database (15dimensions)2D-LDA 2D-PCA Eigenfaces Fisherfaces 0.42100.421028.500032.5310M.Li,B.Yuan /Pattern Recognition Letters 26(2005)527–5325314.ConclusionIn this paper,a new algorithm for image feature extraction and selection was proposed.This meth-od uses the Fisher Linear Discriminant Analysis to enhance the effect of variation caused by different individuals,other than by illumination,expres-sion,orientation,etc.2D-LDA uses the image ma-trix instead of the image vector to compute the between-class scatter matrix and the within-class scatter matrix.From our experiments,we can see that the2D-LDA have many advantages over other methods. 
2D-LDA achieves the best recognition accuracy of the four algorithms. This technique's computing cost is very low compared with Eigenfaces and Fisherfaces, and close to 2D-PCA. At the same time, the method shows strong performance in low dimensions. From Fig. 2, we can see that this new projection method closely resembles selecting evenly spaced vertical scanning lines to represent an image; this may be the reason the algorithm is so effective in image classification. 2D-LDA still has a shortcoming: it needs more memory to store an image than Eigenfaces and Fisherfaces.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (No. 60441002) and the University Key Research Project (No. 2003SZ002).

References

Bartlett, M.S., Movellan, J.R., Sejnowski, T.J., 2002. Face recognition by independent component analysis. IEEE Trans. Neural Networks 13 (6), 1450-1464.
Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J., 1997. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Anal. Machine Intell. 19 (7), 711-720.
Liu, K., Cheng, Y., Yang, J., 1993. Algebraic feature extraction for image recognition based on an optimal discriminant criterion. Pattern Recognition 26 (6), 903-911.
Moghaddam, B., 2002. Principal manifolds and probabilistic subspaces for visual recognition. IEEE Trans. Pattern Anal. Machine Intell. 24 (6), 780-788.
O'Toole, A., 1993. Low-dimensional representation of faces in higher dimensions of the face space. J. Opt. Soc. Amer. 10 (3).
Turk, M., Pentland, A., 1991. Eigenfaces for recognition. J. Cognitive Neurosci. 3 (1), 71-86.
Weng, J., Zhang, Y., Hwang, W., 2003. Candid covariance-free incremental principal component analysis. IEEE Trans. Pattern Anal. Machine Intell. 25 (8), 1034-1040.
Yang, M.H., 2002. Kernel Eigenfaces vs. Kernel Fisherfaces: Face recognition using kernel methods. In: Proc. 5th Internat. Conf. on Automatic Face and Gesture Recognition (FGR'02), pp. 215-220.
Yang, J., Zhang, D., Frangi, A.F., Yang, J.-y., 2004. Two-dimensional PCA: A new approach to appearance-based face representation and recognition. IEEE Trans. Pattern Anal. Machine Intell. 26 (1), 131-137.
Patent title: METHOD AND APPARATUS FOR USE IN OPTIMIZING PHOTOGRAPHIC FILM DEVELOPER PROCESSES
Inventors: BERG, Bernard, J.; ROOD, Patrick, S.
Application number: US1995008130
Filing date: 1995-06-29
Publication number: WO96/000930P1
Publication date: 1996-01-11
Applicant: X-RITE, INCORPORATED
Address: 3100 - 44th Street, S.W., Grandville, MI 49418, US
Country: US
Agent: VISSERMAN, Peter
Abstract: To provide a standard for determining an objective level of performance of photographic film developer processes, a production sensitometer, of the type commonly used in the field, is correlated with a high-precision master sensitometer, defined as a standard. Relative exposure values are computed for each step of a step wedge exposed by a production sensitometer with reference to the corresponding step of a step wedge exposed in the master sensitometer. The relative exposure values are recorded and stored in a read-only memory in the production sensitometer. In the field, the steps of a step wedge on a test film strip, exposed by the production sensitometer and developed by the developer processor to be tested, are read by a densitometer which uses the stored relative exposure values to compute density values for the test strip correlated to the master sensitometer. The developer processor may then be adjusted such that the developed film will match quality control parameters, e.g. speed index, contrast index, etc., supplied by the film supplier.
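The abstract leaves the actual arithmetic unspecified, so the sketch below is only one plausible reading of the correlation step, assuming the stored "relative exposure values" are per-step log-exposure offsets of the production sensitometer relative to the master; the function names and the simple slope-based contrast index are hypothetical illustrations, not taken from the patent.

```python
import numpy as np

def correlate_test_strip(step_density, rel_log_exposure, nominal_log_exposure):
    """Re-reference the measured step-wedge densities of a test strip to the
    master sensitometer's exposure scale (illustrative reading of the abstract).

    step_density         : densities read from the developed test strip, one per step
    rel_log_exposure     : assumed per-step log-exposure offsets of the production unit
                           relative to the master (the stored factory correlation data)
    nominal_log_exposure : the master's nominal log exposure for each step
    """
    step_density = np.asarray(step_density, dtype=float)
    actual_log_e = (np.asarray(nominal_log_exposure, dtype=float)
                    + np.asarray(rel_log_exposure, dtype=float))
    order = np.argsort(actual_log_e)                  # np.interp needs increasing x
    return np.interp(nominal_log_exposure, actual_log_e[order], step_density[order])

def contrast_index(density, log_exposure):
    """A simple slope of the D-logE points, standing in for a contrast index."""
    return float(np.polyfit(log_exposure, density, 1)[0])
```

The corrected densities (and indices derived from them) could then be compared against the film supplier's quality-control aims to decide how the developer processor should be adjusted.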
Reflection Detection in Image SequencesMohamed Abdelaziz Ahmed Francois Pitie Anil KokaramSigmedia,Electronic and Electrical Engineering Department,Trinity College Dublin{/People}AbstractReflections in image sequences consist of several layers superimposed over each other.This phenomenon causes many image processing techniques to fail as they assume the presence of only one layer at each examined site e.g.motion estimation and object recognition.This work presents an automated technique for detecting reflections in image se-quences by analyzing motion trajectories of feature points. It models reflection as regions containing two different lay-ers moving over each other.We present a strong detector based on combining a set of weak detectors.We use novel priors,generate sparse and dense detection maps and our results show high detection rate with rejection to patholog-ical motion and occlusion.1.IntroductionReflections are often the result of superimposing differ-ent layers over each other(see Fig.1,2,4,5).They mainly occur due to photographing objects situated behind a semi reflective medium(e.g.a glass window).As a result the captured image is a mixture between the reflecting surface (background layer)and the reflected image(foreground). When viewed from a moving camera,two different layers moving over each other in different directions are observed. This phenomenon violates many of the existing models for video sequences and hence causes many consumer video applications to fail e.g.slow-motion effects,motion based sports summarization and so on.This calls for the need of an automated technique that detects reflections and assigns a different treatment to them.Detecting reflections requires analyzing data for specific reflection characteristics.However,as reflections can arise by mixing any two images,they come in many shapes and colors(Fig.1,2,4,5).This makes extracting characteris-tics specific to reflections not an easy task.Furthermore, one should be careful when using motion information of re-flections as there is a high probability of motion estimation failure.For these reasons the problem of reflection detec-tion is hard and was not examined before.Reflection can be detected by examining the possibility of decomposing an image into two different layers.Lots of work exist on separating mixtures of semi-transparent lay-ers[17,11,12,7,4,1,13,3,2].Nevertheless,most of the still image techniques[11,4,1,3,2]require two mixtures of the same layers under two different mixing conditions while video techniques[17,12,13]assume a simple rigid motion for the background[17,13]or a repetitive one[12].These assumptions are hardly valid for reflections on mov-ing image sequences.This paper presents an automated technique for detect-ing reflections in image sequences.It is based on analyzing spatio-temporal profiles of feature point trajectories.This work focuses on analyzing three main features of reflec-tions:1)the ability of decomposing an image into two in-dependent layers2)image sharpness3)the temporal be-havior of image patches.Several weak detectors based on analyzing these features through different measures are pro-posed.Afinal strong detector is generated by combining the weak detectors.The problem is formulated within a Bayesian framework and priors are defined in a way to re-ject false alarms.Several sequences are processed and re-sults show high detection rate with rejection to complicated motion patterns e.g.blur,occlusion,fast motion.Aspects of novelty in this paper include:1)A 
technique for decomposing a color still image containing reflection into two images containing the structures of the source lay-ers.We do not claim that this technique could be used to fully remove reflections from videos.What we claim is that the extracted layers can be useful for reflection detection since on a block basis,reflection is reduced.This technique can not compete with state of the art separation techniques.However we use this technique because it works on single frames and thus does not require motion,which is not the case with any existing separation technique.2)Diagnos-tic tools for reflection detection based on analyzing feature points trajectories3)A scheme for combining weak de-tectors in one strong reflection detector using Adaboost4) Incorporating priors which reject spatially and temporally impulsive detections5)The generation of dense detection maps from sparse detections and using thresholding by hys-1Figure1.Examples of different reflections(shown in green).Reflection is the result of superimposing different layers over each other.As a result they have a wide range of colors and shapes.teresis to avoid selecting particular thresholds for the systemparameters6)Using the generated maps to perform betterframe rate conversion in regions of reflection.Frame rateconversion is a computer vision application that is widelyused in the post-production industry.In the next section wepresent a review on the relevant techniques for layer separa-tion.In section3we propose our layer separation technique.We then go to propose our Bayesian framework followed bythe results section.2.Review on Layer Separation TechniquesA mixed image M is modeled as a linear combinationbetween the source layers L1and L2according to the mix-ing parameters(a,b)as follows.M=aL1+bL2(1)Layer separation techniques attempt to decompose reflec-tion M into two independent layers.They do so by ex-changing information between the source layers(L1andL2)until their mutual independence is maximized.Thishowever requires the presence of two mixtures of the samelayers under two different mixing proportions[11,4,1,3,2].Different separation techniques use different forms ofexpressing the mutual layer independence.Current formsused include minimizing the number of corners in the sep-arated layers[7]and minimizing the grayscale correlationbetween the layers[11].Other techniques[17,12,13]avoid the requirement ofhaving two mixtures of the same layers by using tempo-ral information.However they often require either a staticbackground throughout the whole image sequence[17],constraint both layers to be of non-varying content throughtime[13],or require the presence of repetitive dynamic mo-tion in one of the layers[12].Yair Weiss[17]developed atechnique which estimates the intrinsic image(static back-ground)of an image sequence.Gradients of the intrinsiclayer are calculated by temporallyfiltering the gradientfieldof the sequence.Filtering is performed in horizontal andvertical directions and the generated gradients are used toreconstruct the rest of the background image.yer Separation Using Color IndependenceThe source layers of a reflection M are usually color in-dependent.We noticed that the red and blue channels ofM are the two most uncorrelated RGB channels.Each ofthese channels is usually dominated by one layer.Hence thesource layers(L1,L2)can be estimated by exchanging in-formation between the red and blue channels till the mutualindependence between both channels is r-mation exchange for layer separation wasfirst 
introducedby Sarel et.al[12]and it is reformulated for our problem asfollowsL1=M R−αM BL2=M B−βM R(2)Here(M R,M B)are the red and blue channels of themixture M while(α,β)are separation parameters to becalculated.An exhaustive search for(α,β)is performed.Motivated by Levin et.al.work on layer separation[7],thebest separated layer is selected as the one with the lowestcornerness value.The Harris cornerness operator is usedhere.A minimum texture is imposed on the separated lay-ers by discarding layers with a variance less than T x.For an8-bit image,T x is set to2.The removal of this constraintcan generate empty meaningless layers.The novelty in thislayer separation technique is that unlike previous techniques[11,4,1,3,2],it only requires one image.Fig.2shows separation results generated by the proposedtechnique for different images.Results show that our tech-nique reduces reflections and shadows.Results are only dis-played to illustrate a preprocess step,that is used for one ofour reflection measures and not to illustrate full reflectionremoval.Blocky artifacts are due to processing images in50×50blocks.These artifacts are irrelevant to reflectiondetection.4.Bayesian Inference for Reflection Detection(BIRD)The goal of the algorithm is tofind regions in imagesequences containing reflections.This is achieved by an-(a)(b)(c)(d)(e)(f)Figure 2.Reducing reflections/shadows using the proposed layer separation technique.Color images are the original images with reflec-tions/shadows (shown in green).The uncolored images represent one source layer (calculated by our technique)with reflections/shadows reduced.In (e)reflection still remains apparent however the person in the car is fully removed.alyzing trajectories of feature points.Trajectories are gen-erated using KLT feature point tracker [9,14].Denote P inas the feature point of i th track in frame n and F inas the 50×50image patch centered on P in .Trajectories are ana-lyzed by examining all feature points along tracks of length more than 4frames.For each point,analysis are carriedover the three image patches (F i n −1,F i n ,F in +1).Based onthe analysis outcome,a binary label field l in is assigned toeach F i n .l in is set to 1for reflection and 0otherwise.4.1.Bayesian FrameworkThe system derives an estimate for l in from the posterior P (l |F )(where (i,n)are dropped for clarity).The posterior is factorized in a Bayesian fashion as followsP (l |F )=P (F|l )P (l |l N )(3)The likelihood term P (F|l )consists of 9detectors D 1−D 9each performing different analysis on F and operating at thresholds T 1−9(see Sec.4.5.1).The prior P (l |l N )en-forces various smoothness constraints in space and time toreject spatially and temporally impulsive detections and to generate dense detection masks.Here N denote the spatio-temporal neighborhood of the examined site.yer Separation LikelihoodThis likelihood measures the ability of decomposing animage patch F in into two independent layers.Three detec-tors are proposed.Two of them attempts to perform layer separation before analyzing data while the third measures the possibility of layer separation by measuring the color channels independence.Layer Separation via Color Independence D 1:Our technique (presented in Sec.3)is used to decompose the im-age patch F i n into two layers L 1i n and L 2in .This is applied for every point along every track.Reflection is detected by comparing the temporal behavior of the observed image patches F with the temporal behavior of the extracted lay-ers.Patches containing 
reflection are defined as ones with higher temporal discontinuity before separation than after separation.Temporal discontinuity is measured using struc-ture similarity index SSIM[16]as followsD1i n=max(SS(G i n,G i n−1),SS(G i n,G i n+1))−max(SS(L i n,L i n−1),SS(L i n,L i n+1))SS(L i n,L i n−1)=max(SS(L1i n,L1i n−1),SS(L2i n,L2i n−1))) SS(L i n,L i n+1)=max(SS(L1i n,L1i n+1),SS(L2i n,L2i n+1)) Here G=0.1F R+0.7F G+0.2F B where(F R,F G,F B) are the red,green and blue components of F respectively. SS(G i n,G i n−1)denotes the structure similarity between the two images F i n and F i n−1.We only compare the structures of(G i n,G i n−1)by turning off the luminance component of SSIM[16].SS(.,.)returns an a value between0−1where 1denotes identical similarity.Reflection is detected if D1i n is less than T1.Intrinsic Layer Extraction D2:Let INTR i denote the intrinsic(reflectance)image extracted by processing the 50×50i th track using Yair technique[17].In case of re-flection the structure similarity between the observed mix-ture F i n and INTR i should be low.Therefore,F i n isflagged as containing reflection if SS(F i n,INTR i)is less than T2.Color Channels Independence D3:This approach measures the Generalized Normalized Cross Correlation (GNGC)[11]between the red and blue channels of the ex-amined patch F i n to infer whether the patch is a mixture between two different layers or not.GNGC takes values between0and1where1denotes perfect match between the red and blue channels(M R and M B respectively).This analysis is applied to every image patch F i n and reflection is detected if GNGC(M R,M B)<T3.4.3.Image Sharpness Likelihood:D4,D5Two approaches for analyzing image sharpness are used. Thefirst,D4,estimates thefirst order derivatives for the examined patch F i n andflags it as containing reflection if the mean of the gradient magnitude within the examined patch is smaller than a threshold T4.The second approach, D5,uses the sharpness metric of Ferzil et.al.[5]andflagsa patch as reflection if its sharpness value is less than T5.4.4.Temporal Discontinuity LikelihoodSIFT Temporal Profile D6:This detectorflags the ex-amined patch F i n as reflection if its SIFT features[8]are undergoing high temporal mismatch.A vector p=[x s g]is assigned to every interest point in F i n.The vector contains the position of the point x=(x,y),scale and dominate ori-entation from the SIFT descriptor,s=(δ,o),and the128 point SIFT descriptor g.Interest points are matched with neighboring frames using[8].F i n isflagged as reflection if the average distance between the matched vectors p is larger than T6.Color Temporal Profile D7:This detectorflags the im-age patch F i n as reflection if its grayscale profile does not change smoothly through time.The temporal change in color is defined as followsD7i n=min( C i n−C i n−1 , C i n−C i n+1 )(4) Here C i n is the mean value for G i n,the grayscale representa-tion of F i n.F i n isflagged as reflection if D7i n>T7.AutoCorrelation Temporal Profile D8:This detector flags the image patch F i n as reflection if its autocorrelation is undergoing large temporal change.The temporal change in the autocorrelation is defined as followsD8i n=min(1NA i n−A i n−1 2,1NA i n−A i n+1 2)(5)A i n is a vector containing the autocorrelation of G i n while N is the number of pels in A i n.F i n isflagged as reflection if D8i n is bigger than T8.Motion Field Divergence D9:D9for the examined patch F i n is defined as followsD9i n=DFD( div(d(n)) + div(d(n+1)) )/2(6) DFD and div(d(n))are the 
Displaced Frame Difference and Motion Field Divergence for F i n.d(n)is the2D motion vector calculated using block matching.DFD is set to the minimum of the forward and backward DFDs.div(d(n)) is set to the minimum of the forward and backward di-vergence.The divergence is averaged over blocks of two frames to reduce the effect of possible motion blur gener-ated by unsteady camera motion.F i n isflagged as reflection if D9>T9.4.5.Solving for l in4.5.1Maximum Likelihood(ML)SolutionThe likelihood is factorized as followsP(F|l)=P(l|D1)P(l|D2−8)P(l|D9)(7)Thefirst and last terms are solved using D1<T1and D9>T9respectively.D2−8are used to form one strong detector D s and P(l|D2−8)is solved by D s>T s.We found that not including(D1,D9)in D s generates better de-tection results than when included.Feature analysis of each detector are averaged over a block of three frames to gen-erate temporally consistent detections.T9isfixed to10in all experiments.In Sec.4.5.2we avoid selecting particular thresholds for(T1,T s)by imposing spatial and temporal priors on the generated maps.Calculating D s:The strong detector D s is expressed as a linear combination of weak detectors operating at different thresholds T as followsP(l|D2−8)=Mk=1W(V(k),T)P(D V(k)|T)(8)False Alarm RateC o r r e c tD e t e c t i o n R a t eFigure 3.ROC for D 1−9and D s .The Adaboost detector D s out-performs all other techniques and D 1is the second best in the range of false alarms <0.1.Here M is the number of weak detectors (fixed to 20)used in forming D s and V (k )is a function which returns a value between 2-8to indicate which detectors from D 2−8are used.k indexes the weak detectors in order of their impor-tance as defined by the weights W .W and T are learned through Adaboost [15](see Tab.1).Our training set consist of 89393images of size 50×50pels.Reflection is modeled in 35966images each being a synthetic mixture between two different images.Fig.3shows the the Receiver Operating Characteristic (ROC)of applying D 1−9and D s on the training samples.D s outperforms all the other detectors due to its higher cor-rect detection rate and lower false alarms.D 6D 8D 5D 3D 2D 4D 7W 1.310.960.480.520.330.320.26T0.296.76e −60.040.950.6172.17Table 1.Weights W and operating thresholds T for the best seven detectors selected by Adaboost.4.5.2Successive Refinement for Maximum A-Posteriori (MAP)The prior P (l |l N )of Eq.3imposes spatial and temporal smoothness on detection masks.We create a MAP estimate by refining the sparse maps from the previous ML steps.We first refine the labeling of all the existing feature points P in each image and then use the overlapping 50×50patches around the refined labeled points as a dense pixel map.ML Refinement:First we reject false detections from ML which are spatially inconsistent.Every feature point l =1is considered and the sum of the geodesic distance from that site to the two closest neighbors which are labeledl =1is measured.When that distance is more than 0.005then that decision is rejected i.e.we set l =0.Geodesic distances allow the nature of the image material between point to be taken in to account more effectively and have been in use for some time now [10].To reduce the compu-tational load of this step,we downsample the image mas-sively by 50in both directions.This retains gross image topology only.Spatio-Temporal Dilation:Labels are extended in space and time to other feature points along their trajecto-ries.If l in =1,all feature points lying along the track i are set to l =1.In addition,l is 
4.5.2 Successive Refinement for Maximum A-Posteriori (MAP)

The prior P(l|l_N) of Eq. 3 imposes spatial and temporal smoothness on the detection masks. We create a MAP estimate by refining the sparse maps from the previous ML steps. We first refine the labeling of all the existing feature points P in each image and then use the overlapping 50x50 patches around the refined labeled points as a dense pixel map.

ML Refinement: First we reject false detections from ML which are spatially inconsistent. Every feature point with l = 1 is considered, and the sum of the geodesic distances from that site to the two closest neighbors which are labeled l = 1 is measured. When that distance is more than 0.005, the decision is rejected, i.e. we set l = 0. Geodesic distances allow the nature of the image material between points to be taken into account more effectively and have been in use for some time now [10]. To reduce the computational load of this step, we downsample the image massively, by 50 in both directions. This retains gross image topology only.

Spatio-Temporal Dilation: Labels are extended in space and time to other feature points along their trajectories. If l_n^i = 1, all feature points lying along the track i are set to l = 1. In addition, l is extended to all image patches (F_n) overlapping spatially with the examined patch. This generates a denser representation of the detection masks. We call this step ML-Denser.

Hysteresis: We can avoid selecting particular thresholds [T_1, T_s] for BIRD by applying hysteresis using a set of different thresholds. Let T_H = [-0.4, 5] and T_L = [0, 3] denote a high and a low configuration for [T_1, T_s]. Detection starts by examining ML-Denser at the high thresholds. High thresholds generate detected points P_h with high confidence. Points within a small geodesic distance (< D_geo) and a small Euclidean distance (< D_euc) of each other are grouped together. Here we use (D_geo, D_euc) = (0.0025, 4) and resize the examined frames as mentioned previously. The centroid of each group is then calculated. Thresholds are lowered, and a new detection point is added to an existing group if it is within D_geo and D_euc of the centroid of this group. This is the hysteresis idea. If, however, the examined point has a large Euclidean distance (> D_euc) but a small geodesic distance (< D_geo) to the centroids of all existing groups, a new group is formed. Points whose distances are > D_geo and > D_euc are regarded as outliers and discarded. Group centroids are updated and the whole process is repeated iteratively until the examined threshold reaches T_L. The detection map generated at T_L is made denser by performing the Spatio-Temporal Dilation step above.

Spatio-Temporal 'Opening': False alarms of the previous step are removed by propagating the patches detected in the first frame to the rest of the sequence along the feature-point trajectories. A detection sample at frame n is kept if it agrees with the propagated detections from the previous frame. Correct detections missed by this step are recovered by running Spatio-Temporal Dilation on the 'temporally eroded' solution. This does mean that trajectories which do not start in the first frame are not likely to be considered; however, this does not affect the performance on the real examples shown here. The selection of an optimal frame from which to perform this opening operation is the subject of future work.
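The ML Refinement step can be sketched as follows. The geodesic distance here uses the gradient magnitude of the massively downsampled frame as the local path cost, which is only one plausible reading of the feature-assisted formulation of [10]; the 0.005 rejection threshold and the downsampling factor of 50 are the values quoted above, while `route_through_array` from scikit-image stands in for the original implementation.

```python
import numpy as np
from skimage.graph import route_through_array
from skimage.transform import resize

def geodesic_cost(gray_small, p, q):
    """Geodesic distance between two points on a downsampled frame.
    Local cost = gradient magnitude (one plausible choice; the paper
    relies on the feature-assisted formulation of [10])."""
    gy, gx = np.gradient(gray_small)
    cost = np.hypot(gx, gy) + 1e-6          # strictly positive costs
    _, dist = route_through_array(cost, p, q,
                                  fully_connected=True, geometric=True)
    return dist

def refine_labels(gray, points, labels, thresh=0.005, factor=50):
    """ML Refinement: keep a positive detection only if the summed
    geodesic distance to its two nearest positive neighbours stays
    below `thresh` on the frame downsampled by `factor`."""
    small = resize(gray, (gray.shape[0] // factor, gray.shape[1] // factor),
                   anti_aliasing=True)
    pts = [(int(y / factor), int(x / factor)) for (y, x) in points]
    keep = labels.copy()
    pos = [k for k, lab in enumerate(labels) if lab == 1]
    for k in pos:
        others = [j for j in pos if j != k]
        d = sorted(geodesic_cost(small, pts[k], pts[j]) for j in others)
        if len(d) >= 2 and d[0] + d[1] > thresh:
            keep[k] = 0
    return keep

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    frame = rng.random((576, 720))                    # toy grayscale frame
    pts = [(40, 40), (60, 260), (400, 600)]           # feature points (y, x)
    print(refine_labels(frame, pts, [1, 1, 1]))
```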
Figure 4. From top: ML (calculated at (T_1, T_s) = (-0.13, 3.15)), Hysteresis, and Spatio-Temporal 'Opening' for three consecutive frames from the SelimH sequence. Reflection is shown in red and detected reflection using our technique is shown in green. Spatio-Temporal 'Opening' rejects false alarms generated by ML and by Hysteresis (shown in yellow and blue respectively).

5. Results

5.1. Reflection Detection

15 sequences containing 932 frames of size 576x720 are processed with BIRD. Full sequences with reflection detection can be found in /Misc/CVPR2011. Fig. 4 compares the ML, Hysteresis and Spatio-Temporal 'Opening' results for three consecutive frames from the SelimH sequence. This sequence contains occlusion, motion blur and strong edges in the reflection (shown in red). The ML solution (first line) generates good sparse reflection detection (shown in green); however, it generates some errors (shown in yellow). Hysteresis rejects these errors and generates dense masks with some false alarms (shown in blue). These false alarms are rejected by Spatio-Temporal 'Opening'.

Fig. 5 shows the result of processing four sequences with BIRD. In the first two sequences, BIRD detected regions of reflection correctly and discarded regions of occlusion (shown in purple) and motion blur (shown in blue). In GirlRef most of the sequence is correctly classified as reflection. In SelimK1 the portrait on the right is correctly classified as containing reflection even in the presence of motion blur (shown in blue). Nevertheless, BIRD failed to detect the reflection on the left portrait as it does not contain strong distinctive feature points.

Fig. 6 shows the ROC plot for 50 frames from SelimH. Here we compare our technique, BIRD, against DFD and Image Sharpness [5]. DFD flags a region as reflection if it has a high displaced frame difference. Image Sharpness flags a region as reflection if it has low sharpness. Frames are processed in 50x50 blocks. Ground-truth reflection masks are generated manually and detection rates are calculated on a per-pel basis. The ROC shows that BIRD outperforms the other techniques by achieving a very high correct detection rate of 0.9 at a false detection rate of 0.1. This is a major improvement over correct detection rates of 0.2 and 0.1 for DFD and Sharpness respectively.

Figure 5. Detection results of BIRD (shown in green) on, from top: BuilOnWind [10, 35, 49], PHouse 9-11, GirlRef [45, 55, 65], SelimK1 32-35. Reflections are shown in red. Good detections are generated despite occlusion (shown in purple) and motion blur (shown in blue). For GirlRef we replace Hysteresis and Spatio-Temporal 'Opening' with a manual parameter configuration of (T_1, T_s) = (-0.01, 3.15) followed by a Spatio-Temporal Dilation step. This setting generates good detections for all examined sequences with static backgrounds.

Figure 6. ROC plots for our technique BIRD, DFD and Sharpness for SelimH. Our technique BIRD outperforms DFD and Sharpness with a massive increase in the Correct Detection Rate.

5.2. Frame Rate Conversion: An Application

One application for reflection detection is improving frame rate conversion in regions of reflection. Frame rate conversion is the process of creating new frames from existing ones. This is done by using motion vectors to interpolate objects in the new frames. This process usually fails in regions of reflection due to motion estimation failure. Fig. 7 illustrates the generation of a slow-motion effect for the person's leg in GirlRef (see Fig. 5, third line). This is done by doubling the frame rate using the Foundry's Kronos plugin [6]. Kronos has an input which defines the density of the motion vector field. The larger the density, the more detailed the vectors and hence the better the interpolation. However, using highly detailed vectors generates artifacts in regions of reflection, as shown in Fig. 7 (second line). We reduce these artifacts by lowering the motion vector density in regions of reflection indicated by BIRD (see Fig. 7, third line), as sketched below. Image sequence results and more examples are available in /Misc/CVPR2011.
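A minimal sketch of the density-lowering idea is given below: wherever the BIRD mask flags reflection, the dense motion field is replaced by a block-averaged, coarser field, while the rest of the frame keeps its detailed vectors. The block size and the arrays in the demo are assumptions for illustration only; in the actual pipeline this effect is obtained through Kronos's motion-vector-density input, as described above.

```python
import numpy as np

def lower_vector_density(motion, mask, block=16):
    """Coarsen a dense 2D motion field inside detected reflection
    regions: vectors in blocks touched by the mask are replaced by the
    block-average vector, mimicking a sparser vector field there.
    motion: (H, W, 2) dense field; mask: (H, W) boolean BIRD output."""
    out = motion.copy()
    h, w = mask.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            win = (slice(y, min(y + block, h)), slice(x, min(x + block, w)))
            if mask[win].any():
                avg = motion[win].reshape(-1, 2).mean(axis=0)
                out[win] = avg
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    field = rng.normal(size=(576, 720, 2))   # hypothetical dense motion field
    refl = np.zeros((576, 720), bool)
    refl[200:320, 100:300] = True            # hypothetical BIRD detections
    smooth = lower_vector_density(field, refl)
    print(smooth[250, 150], field[250, 150])
```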
Figure 7. Slow-motion effect for the person's leg in GirlRef (see Fig. 5, third line). Top: original frames 59-61; Middle: generated frames using the Foundry's plugin Kronos [6] with one motion vector calculated for every 4 pels; Bottom: with one motion vector calculated for every 64 pels in regions of reflection.

6. Conclusion

This paper has presented a technique for detecting reflections in image sequences. This problem has not been addressed before. Our technique performs several analyses on feature-point trajectories and generates a strong detector by combining these analyses. Results show a major improvement over techniques which measure image sharpness and temporal discontinuity. Our technique generates a high correct detection rate while rejecting regions containing complicated motion, e.g. motion blur and occlusion. The technique was fully automated in generating most results. As an application, we showed how the generated detections can be used to improve frame rate conversion. A limiting factor of our technique is that it requires source layers with strong distinctive feature points. This could lead to incomplete detections.

Acknowledgment: This work is funded by the Irish Research Council for Science, Engineering and Technology (IRCSET) and Science Foundation Ireland (SFI).

References
[1] A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, and Y. Y. Zeevi. Sparse ICA for blind separation of transmitted and reflected images. International Journal of Imaging Systems and Technology, 15(1):84-91, 2005.
[2] N. Chen and P. De Leon. Blind image separation through kurtosis maximization. In Asilomar Conference on Signals, Systems and Computers, volume 1, pages 318-322, 2001.
[3] K. Diamantaras and T. Papadimitriou. Blind separation of reflections using the image mixtures ratio. In ICIP, pages 1034-1037, 2005.
[4] H. Farid and E. Adelson. Separating reflections from images by use of independent components analysis. Journal of the Optical Society of America, 16(9):2136-2145, 1999.
[5] R. Ferzli and L. J. Karam. A no-reference objective image sharpness metric based on the notion of just noticeable blur (JNB). IEEE Trans. on Image Processing (TIP), 18(4):717-728, 2009.
[6] The Foundry. Nuke, Furnace.
[7] A. Levin, A. Zomet, and Y. Weiss. Separating reflections from a single image using local features. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 306-313, 2004.
[8] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91-110, 2004.
[9] B. D. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In DARPA Image Understanding Workshop, pages 121-130, 1981.
[10] D. Ring and F. Pitie. Feature-assisted sparse to dense motion estimation using geodesic distances. In International Machine Vision and Image Processing Conference, pages 7-12, 2009.
[11] B. Sarel and M. Irani. Separating transparent layers through layer information exchange. In European Conference on Computer Vision (ECCV), pages 328-341, 2004.
[12] B. Sarel and M. Irani. Separating transparent layers of repetitive dynamic behaviors. In ICCV, pages 26-32, 2005.
[13] R. Szeliski, S. Avidan, and P. Anandan. Layer extraction from multiple images containing reflections and transparency. In CVPR, volume 1, pages 246-253, 2000.
[14] C. Tomasi and T. Kanade. Detection and tracking of point features. Carnegie Mellon University Technical Report CMU-CS-91-132, 1991.
[15] P. Viola and M. Jones. Robust real-time object detection. International Journal of Computer Vision, 2001.
[16] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Trans. on Image Processing (TIP), 13(4):600-612, April 2004.
[17] Y. Weiss. Deriving intrinsic images from image sequences. In ICCV, pages 68-75, 2001.
Inhomogeneous Surface Diffusion for Image Filtering
3 Related Diffusions
3.1 Curve Diffusions
Recent work on geometric diffusions in image filtering has focused on diffusing the isophote curve, I(x, y) = constant. The motivation has been to smooth along the tangent to the isophote, thereby preserving it. One diffusion model that has been suggested [2] reduces in the limit (of scale space) to a diffusion rate

$$\frac{\partial I}{\partial t} = \hat{\kappa} = \frac{I_{xx} I_y^2 - 2 I_{xy} I_x I_y + I_{yy} I_x^2}{|\nabla I|^3} \qquad (11)$$
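For concreteness, a minimal numpy sketch of one explicit Euler step of Eq. (11) is given below. The central-difference derivatives, the time step, and the small regularization added to |grad I| (to avoid division by zero in flat regions) are implementation choices of this sketch, not part of the model in [2].

```python
import numpy as np

def curvature_flow_step(I, dt=0.05, eps=1e-2):
    """One explicit Euler step of the isophote-curvature diffusion of
    Eq. (11): dI/dt = (Ixx*Iy^2 - 2*Ixy*Ix*Iy + Iyy*Ix^2) / |grad I|^3.
    Derivatives use central differences; `eps` regularizes flat regions."""
    Iy, Ix = np.gradient(I)          # gradient along rows (y) and columns (x)
    Iyy, _ = np.gradient(Iy)
    Ixy, Ixx = np.gradient(Ix)
    num = Ixx * Iy**2 - 2.0 * Ixy * Ix * Iy + Iyy * Ix**2
    den = (Ix**2 + Iy**2 + eps) ** 1.5
    return I + dt * num / den

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))
    out = img.copy()
    for _ in range(20):              # a few smoothing iterations
        out = curvature_flow_step(out)
    print("mean |dI/dy| before:", float(np.abs(np.gradient(img)[0]).mean()),
          "after:", float(np.abs(np.gradient(out)[0]).mean()))
```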
2 Theory of Image Surface Mean Curvature Diffusion
2.1 The Surface Representation
By introducing the third coordinate of space z and assigning the image intensity to this coordinate, the image can be characterized by
1 Introduction
Methods of geometry-driven diffusion for image filtering are often based on an analogy to the diffusion of heat. Perona and Malik [1] argued that the conduction coefficient in the heat equation, which is space invariant in the classic theory, should be allowed to vary spatially. They provided a mathematical foundation for the concept of selective smoothing. More recent work on inhomogeneous diffusion in image filtering has focused on diffusing the isophote curve, and is based on a level function or curve diffusion setting [2-6]. We present a less familiar approach to the development of inhomogeneous diffusion (ID) algorithms in which the image is regarded as a surface in three-space. The approach we describe is also related to the Perona and Malik anisotropic diffusion model [1], but rather than viewing the model
A Reference Guide to the IRAF/DAOPHOT PackageLindsey E.DavisIRAF Programming GroupNational Optical Astronomy Observatories††Tucson,Arizona85726January1994ABSTRACTDAOPHOT is a software package for doing stellar photometry in crowded stellarfields developed by Peter Stetson(1987)of the Dominion Astrophysical Observatory.IRAF/DAOPHOT uses the task structure and algorithms of DAO-PHOT to do crowded-field stellar photometry within the IRAF data reduction and analysis environment.This document briefly describes the principal similarities and differences between DAOPHOT and IRAF/DAOPHOT,the data preparation required to successfully use IRAF/DAOPHOT,how to examine and edit the IRAF/DAOPHOT algorithm parameters,how to run the IRAF/DAOPHOT package tasks interactively,non-interactively,or in the background,and how to examine and perform simple database operations on the output photometryfiles.This document is intended as a reference guide to the details of using and interpreting the results of IRAF/DAOPHOT not a user’s cookbook or a general guide to doing photometry in IRAF.Its goal is to take the user from a fully reduced image of a crowded stellarfield to aperture corrected instrumental mag-nitudes using a small artificial image as a sample data set.First time IRAF/DAOPHOT users should consult A User’s Guide to Stellar Photometry With IRAF,by Phil Massey and Lindsey Davis.Detailed descriptions of the DAOPHOT photometry algorithms can be found in Stetson(1987,1990,1992).††Operated by the Association of Universities for Research in Astronomy,Inc.under cooperative agreement with the National Science Foundation.Contents1.Introduction (1)2.DAOPHOT and IRAF/DAOPHOT (1)3.Preparing Data for DAOPHOT (3)4.Some IRAF Basics for New IRAF and DAOPHOT Users (4)4.1.Pre-loaded Packages (4)4.1.1.The DATAIO Package (5)4.1.2.The PLOT Package (5)4.1.3.The IMAGES Package (5)4.1.4.The TV Package (5)4.2.Other Useful Packages and Tasks (5)4.3.Image Types,Image Directories,and Image Headers (5)4.4.The Image Display and Image Cursor (6)4.5.The Graphics Device and Graphics Cursor (7)5.Some DAOPHOT Basics for New DAOPHOT Users (8)5.1.Loading the DAOPHOT Package (8)5.2.Loading the TABLES Package (8)5.3.Running the Test Script (8)5.4.On-line Help (9)5.5.Editing the Package Parameters (10)5.6.Editing the Task Parameters (11)5.7.Input and Output Image Names (11)5.8.Input and Output File Names (12)5.9.Algorithm Parameter Sets (12)5.10.Interactive Mode and Non-Interactive Mode (14)5.11.Image and Graphics Cursor Input (14)5.12.Graphics Output (15)5.13.Verify,Update,and Verbose (15)5.14.Background Jobs (15)5.15.Timing Tests (16)6.Doing Photometry with DAOPHOT (16)6.1.The Test Image (16)6.2.Typical Analysis Sequence (17)6.3.Creating and Organizing an Analysis Directory (19)6.4.Reading the Data (19)6.5.Editing the Image Headers (19)6.5.1.The Minimum Image Header Requirements (19)6.5.2.The Effective Gain and Readout Noise (19)6.5.3.The Maximum Good Data Value (21)6.5.4.The Effective Exposure Time (22)6.5.5.The Airmass,Filter Id,and Time of Observation (22)6.5.6.Batch Header Editing (24)6.6.Editing,Checking,and Storing the Algorithm Parameters (24)6.6.1.The Critical Algorithm Parameters (24)6.6.2.Editing the Algorithm Parameters Interactively with Daoedit (24)6.6.2.1.The Data Dependent Algorithm Parameters (25)6.6.2.2.The Centering Algorithm Parameters (28)6.6.2.3.The Sky Fitting Algorithm Parameters (29)6.6.2.4.The Aperture Photometry Parameters (29)6.6.2.5.The Psf Modeling and Fitting Parameters (30)6.6.2.6.Setting the Algorithm 
Parameters Graphically (31)6.6.3.Checking the Algorithm Parameters with Daoedit (31)6.6.4.Storing the Algorithm Parameter Values with Setimpars (32)6.6.5.Restoring the Algorithm Parameter Values with Setimpars (32)6.7.Creating a Star List (32)6.7.1.The Daofind Task (33)6.7.1.1.The Daofind Algorithm (33)6.7.1.2.The Daofind Algorithm Parameters (33)6.7.1.3.Running Daofind Non-Interactively (34)6.7.1.4.Running Daofind Interactively (34)6.7.1.5.The Daofind Output (36)6.7.1.6.Examining the Daofind Output (37)6.7.2.Rgcursor and Rimcursor (38)er Program (39)6.7.4.Modifying an Existing Coordinate List (39)6.8.Initializing the Photometry with Phot (39)6.8.1.The Phot Algorithm (39)6.8.2.The Phot Algorithm Parameters (40)6.8.3.Running Phot Non-interactively (40)6.8.4.Running Phot Interactively (42)6.8.5.The Phot Output (43)6.8.6.Examining the Results of Phot (44)6.9.Creating a Psf Star List with Pstselect (44)6.9.1.The Pstselect Algorithm (45)6.9.2.The Pstselect Algorithm Parameters (45)6.9.3.How Many Psf Stars Should Be Selected? (46)6.9.4.Running Pstselect Non-interactively (47)6.9.5.Running Pstselect Interactively (47)6.9.6.The Pstselect Output (48)6.9.7.Examining and/or Editing the Results of Pstselect (48)puting the Psf Model with Psf (49)6.10.1.The Psf Algorithm (49)6.10.2.Choosing the Appropriate Analytic Function (50)6.10.3.The Analytic Psf Model (50)6.10.4.The Empirical Constant Psf Model (51)6.10.5.The Empirical Variable Psf Model (51)6.10.6.Rejecting Bad Data from the Psf Model (51)6.10.7.The Model Psf Psfrad and Fitrad (52)6.10.8.Modeling the Psf Interactively Without a Psf Star List (52)6.10.9.Fitting the Psf Model Interactively Using an Initial Psf Star List (54)6.10.10.Fitting the Psf Model Interactively Without an Image Display (55)6.10.11.Fitting the Psf Model Non-interactively (56)6.10.12.The Output of Psf (57)6.10.13.Checking the Psf Model (59)6.10.14.Removing Bad Stars from the Psf Model (62)6.10.15.Adding New Stars to a Psf Star Group (62)6.10.16.Refitting the Psf Model With the New Psf Star Groups (62)puting the Final Psf Model (63)6.10.18.Visualizing the Psf Model with the Seepsf Task (63)6.10.19.Problems Computing the Psf Model (64)6.11.Doing Psf Fitting Photometry with Peak,Nstar,or Allstar (65)6.11.1.Fitting Single Stars with Peak (65)6.11.1.1.The Peak Algorithm (65)6.11.1.2.Running Peak (65)6.11.1.3.The Peak Output (66)6.11.2.Fitting Stars with Group,Grpselect,Nstar and Substar (67)6.11.2.1.The Group and Nstar Algorithms (67)6.11.2.2.Running Group,Grpselect,and Nstar (68)6.11.2.3.The Nstar Output (70)6.11.3.Fitting Stars With Allstar (71)6.11.3.1.The Allstar Algorithm (71)6.11.3.2.Running Allstar (72)6.11.3.3.The Allstar Output (73)6.12.Examining the Output Photometry Files (73)6.13.Problems with the Photometry (74)6.14.Detecting Stars Missed By Daofind (75)6.15.Initializing the Missing Star Photometry with Phot (75)6.16.Merging Photometry Files with Pfmerge (76)6.17.Refitting the Stars with Allstar (76)6.18.Examining the Subtracted Image (76)puting an Aperture Correction (76)7.References (77)8.Appendices (77)8.1.The Instrumental Magnitude Scale (77)8.2.The Analytic Psf Models (77)8.3.The Error Model (78)8.4.The Radial Weighting Function (78)8.5.Total Weights (78)8.6.Bad Data Detection (78)8.7.Stellar Mergers (79)8.8.Faint Stars (79)A Reference Guide to the IRAF/DAOPHOT PackageLindsey E.DavisIRAF Programming GroupNational Optical Astronomy Observatories††Tucson,Arizona85726January19941.IntroductionDAOPHOT is a software package for doing stellar photometry in crowdedfields 
developed by Peter Stetson of the DAO(1987,1990,1992).The IRAF/DAOPHOT package uses the task structure and algorithms of DAOPHOT to do crowdedfield photometry within the IRAF data reduction and analysis environment.Input to IRAF/DAOPHOT consists of an IRAF imagefile,numerous parameters control-ling the analysis algorithms and,optionally,graphics cursor and/or image display cursor input. IRAF/DAOPHOT produces output photometryfiles in either text format or STSDAS binary table format.Some IRAF/DAOPHOT tasks also produce image output and graphics output in the form of plot metacodefiles.Separate tasks are provided for examining,editing,storing,and recalling the analysis parameters,creating and editing star lists,computing accurate centers,sky values and initial magnitudes for the stars in the list,computing the point-spread function,grouping the stars into physical associations,fitting the stars either singly or in groups,subtracting thefitted stars from the original image,and adding artificial test stars to the original image.A set of tools are also provided for examining and editing the output photometryfiles.2.DAOPHOT and IRAF/DAOPHOTThe principal similarities and differences between DAOPHOT and IRAF/DAOPHOT are summarized below.[1]The structure of IRAF/DAOPHOT is very similar to the structure of DAOPHOT.All theDAOPHOT photometry tasks and many of the utilities tasks are present inIRAF/DAOPHOT and in many cases the DAOPHOT task names have been preserved.A listing of the DAOPHOT photometry tasks and their closest IRAF/DAOPHOT equivalents is shown below.DAOPHOT IRAF/DAOPHOTTASK EQUIVALENTadd*addstarallstar allstarattach N/Aappend pfmerge,pconcat††Operated by the Association of Universities for Research in Astronomy,Inc.under cooperative agreement with the National Science Foundation.find daofindgroup groupmonitor daophot.verbose=yesnomonitor daophot.verbose=nonstar nstaroffset pcalcoptions daoeditpeak peakphotometry photpick pstselectpsf psfselect grpselectsort psort,prenumbersub*substar[2]Some DAOPHOT utilities tasks are missing from IRAF/DAOPHOT.The DAOPHOTtasks dump,exit,fudge,help,list,and sky have been replaced with general IRAF tasks, or with IRAF system facilities that perform the equivalent function.The missing DAO-PHOT utilities tasks and their IRAF equivalents are shown below.DAOPHOT IRAF/DAOPHOTTASK EQUIVALENTdump listpixels,imexamineexit byefudge imreplace,fixpix,imedithelp help daophotlist imheadersky imstatistics,phistogram,imexamine[3]The IRAF/DAOPHOT default algorithms are the DAOPHOT II algorithms(Stetson1992).[4]Users have more choice of and control over the algorithms in IRAF/DAOPHOT than theydo in DAOPHOT.For example the IRAF/DAOPHOT aperture photometry task photoffers several skyfitting algorithms besides the default"mode"algorithm,and full control over the skyfitting algorithm parameters.[5]The algorithm parameters in IRAF/DAOPHOT are grouped by function into six parametersets or psets rather than three as in DAOPHOT.The six IRAF/DAOPHOT parameter sets with their DAOPHOT equivalents in brackets are:1)datapars,the data definition parame-ters(daophot.opt),2)findpars,the detection algorithm parameters(daophot.opt),3)cen-terpars,the aperture photometry centering algorithm parameters(no equivalent),4)fitskypars,the aperture photometry skyfitting parameters(photo.opt),5)photpars,the aperture photometry parameters(photo.opt),6)daopars,the IRAF/DAOPHOT psffitting parameters(daophot.opt,allstar.opt).[6]The IRAF/DAOPHOT algorithm parameter sets unlike the DAOPHOT parameter sets 
canbe interactively examined,edited and saved with the daoedit task using the image display and radial profile plots.[7]The IRAF/DAOPHOT algorithm parameter sets unlike the DAOPHOT parameter sets canbe saved and restored as a function of image using the setimpars task.[8]Memory allocation in IRAF/DAOPHOT is dynamic not static as in DAOPHOT.IRAF/DAOPHOT allocates and frees memory as required at run-time subject to the physi-cal memory and swap space limitations of the host computer.[9]The IRAF/DAOPHOT point-spread function look-up table is stored in an IRAF image notan ASCII table as in DAOPHOT.[10]Unlike DAOPHOT,the IRAF/DAOPHOT tasks daofind,phot,pstselect and psf can berun interactively using the image display and graphics window or non-interactively.Display and graphics capabilities were deliberately omitted from DAOPHOT to minimize portability problems.[11]The IRAF/DAOPHOT output photometryfiles can be written in either text format as inDAOPHOT or STSDAS binary table format.[12]Unlike DAOPHOT,fields or columns in both IRAF/DAOPHOT text and STSDAS binarytable photometryfiles are identified by name and have an associated units and formatspecifier.The IRAF/DAOPHOT photometryfile input routines search for column names, for example"GROUP,ID,XCENTER,YCENTER,MAG,MSKY"as appropriate but areindependent of their placement in the inputfile.[13]Several general purpose IRAF/DAOPHOT tasks are available for performing operations onthefinal output photometry catalogs.In addition to pcalc,pconcat,pfmerge,prenumber, and psort which are also available in DAOPHOT,there are three photometryfile editing tasks which have no analog in DAOPHOT pdump,pexamine,and pselect.All thesetasks work on IRAF/DAOPHOT output textfiles or STSDAS binary tables.AnIRAF/DAOPHOT task is supplied for converting output textfiles to STSDAS binarytables so as to make use of the even more general STSDAS tables manipulation tools in the TABLES package.[14]The IRAF/DAOPHOT outputfiles are self-documenting.All the information required tocomprehend the history of or decode the output photometryfile is in thefile itself,includ-ing the IRAF version number,host computer,date,time,and names of all the input and outputfiles and the values of all the parameters.For the remainder of this document IRAF/DAOPHOT will be referred to as DAOPHOT.3.Preparing Data for DAOPHOT[1]DAOPHOT assumes that the images to be analyzed exist on disk in IRAF image format.DAOPHOT can read and write old IRAF format".imh"images and ST IRAF format".hhh"images.When the IRAF FITS kernel becomes available DAOPHOT will be able to read FITS images on disk as well.QPOE IRAF format".qp"images must be rasterized before they can be input to DAOPHOT.[2]All internal DAOPHOT calculations are done in real precision.The pixel type of theimage data on disk may be any of the following data types:short integer,unsigned short integer,integer,long integer,real or ers should realize that the extra precision in images of type double will not be used by DAOPHOT.[3]The instrumental signature must be removed from the input images prior to running DAO-PHOT.All CCD images should be overscan corrected,bias corrected,dark currentcorrected andflat-fiers should be aware of the IRAF CCDRED package forreducing CCD data.[4]DAOPHOT assumes that the input pixel data is linear.If the data is non-linear over alarge fraction of its total dynamic range,the data must be linearized before running DAO-PHOT.[5]Saturated pixels or pixels distinguishable from good data by intensity,do not need to beremoved from 
the image prior to running DAOPHOT.For example if the data is non-linear only above25000counts,DAOPHOT can be instructed to ignore pixels above25000counts.[6]Extreme-valued pixels should be removed from the images prior to running DAOPHOT.Extreme-valued pixels include those with values at or near thefloating point limits of the host machine and host machine special numbers produced by operations like divide byzero,floating point underflows and overflows,etc.The latter category of extreme-valued pixels should not be produced by IRAF software,but may be produced by user programs including imfort programs.Floating point operations involving such numbers will fre-quently cause arithmetic exception errors,since for efficiency and portability reasons the DAOPHOT package and most IRAF tasks do not test for their presence.The imreplace task in the PROTO package can be used to remove extreme-valued pixels.[7]The background sky value should NOT be subtracted from the image prior to entering theDAOPHOT package.The DAOPHOTfitting routines use an optimal weighting scheme which depends on the readout noise,the gain,and the true counts in the pixels.If themean sky has been subtracted then the counts in the image are not the true counts and the computed weights will be incorrect.For similar reasons users should not attempt to correct their magnitudes for exposure time by dividing their images by the exposure time.[8]Cosmic ray and bad pixel removal programs should be used with caution.If the data andparameter values are set such that the cosmic ray and bad pixel detection and removalalgorithms have difficulty distinguishing between stars and bad pixels or cosmic rays,the peaks of the stars may be clipped,altering the point-spread function and introducing errors into the photometry.[9]DAOPHOT assumes that the local sky background is approximatelyflat in the vicinity ofthe object being measured.This assumption is equivalent to requiring that the local sky region have a unique mode.Variations in the sky background which occur on the same scale as the size of the local sky region will introduce errors into the photometry.[10]The point spread function must be constant or smoothly varying with position over theentire image.This is the fundamental assumption underlying all of DAOPHOT.All stars in the image must be indistinguishable except for position and magnitude.The variable point spread function option is capable of handling second order variability as a function of position in the image.[11]The input images should not have undergone any operations which fundamentally alter theimage point spread function or the image statistics in a non-linear way.For example,non-linear image restoration tasks must not be run on the image to prior to running DAO-PHOT.[12]The gain,readout noise,exposure time,airmass,filter,and observing time should bepresent and correct in the image headers before DAOPHOT reductions are begun.DAO-PHOT tasks can extract this information from the image headers,use it in the computa-tions,and/or store it in the output photometryfiles,greatly simplifying the analysis and subsequent calibration procedures.4.Some IRAF Basics for New IRAF and DAOPHOT Users4.1.Pre-loaded PackagesUnder IRAF versions2.10and later the DATAIO,PLOT,IMAGES,TV and NOAO pack-ages are pre-loaded so that all the tasks directly under them are available when IRAF is started. 
Each of these packages contains tasks which are useful to DAOPHOT users for various reasons,and each is discussed briefly below.4.1.1.The DATAIO PackageDAOPHOT users should be aware of the DATAIO rfits and wfits tasks which are used to transport data into and out of IRAF.Any input and output images,including point-spread func-tion look-up table images,should normally be archived with wfits.The cardimage reader and writer tasks for archiving textfiles,rcardimage and wcardimage,are also located here.4.1.2.The PLOT PackageVarious general purpose image andfile plotting utilities can be found in the PLOT pack-ages.DAOPHOT users should be aware of the interactive image row and column plotting task implot,the image contour plotting task contour,the image surface plotting task surface,image histogram plotting task phistogram,the image radial profile plotting task pradprof,and the general purpose graphing tool graph.The tasks gkidir and gkiextract are also useful for extracting individual plots from the plot metacodefiles which may be produced by some DAO-PHOT tasks.4.1.3.The IMAGES PackageThe IMAGES package contains a set of general purpose image operators.DAOPHOT users should be aware of the image header examining tasks imheader and hselect,the header editing task hedit,the coordinate and pixel value dumping task listpixels,and the image statis-tics task imstatistics.4.1.4.The TV PackageThe TV package contains tasks which interact with the image display including the all important display task for displaying images,the interactive image examining task imexamine, and the tvmark task for marking objects on the image display.DAOPHOT users should become familiar with all three of these tasks.4.2.Other Useful Packages and TasksThe NPROTO package contains two useful tasks,findgain,for computing the gain and readout noise of a CCD from a pair of biases andflats,andfindthresh for computing the stan-dard deviation of the background in a CCD frame given the readout noise and gain.The ASTU-TIL package contains the setairmass task for computing and/or correcting the airmass given the appropriate input ers might also wish to experiment with the tasks in the artificial data package ARTDATA,and run the resulting images through DAOPHOT.4.3.Image Types,Image Directories,and Image HeadersThe IRAF image environment is controlled by several environment variables.The most important of these for DAOPHOT users are:imtype the disk image format,imdir the default pixel directory,and min_lenuserarea the maximum length of the image header.The values ofthese environment variables can be listed as shown below.cl>show imtypeimhcl>show imdir/data/davis/pixels/cl>show min_lenuserarea24000"imh"is the default image format for most IRAF users,"hhh"the default image format for ST users,and"qp"the photon counting format used for photon counting data.DAOPHOT will work transparently on"imh"and"hhh"images."qp"event lists must be rasterized prior to using DAOPHOT.When IRAF supports FITS images on disk,image format"fits",DAOPHOT will be able to work directly on FITS images as well.IRAF uses the image name extension,e.g. "imh"to automatically sense the image disk format on input.The output disk format is set by: 1)the extension of the output image name if present e.g."imh",2)the cl environment variable imtype if the output image is opened as a new image,e.g.the output of the rfits task,3)the type of the input image if the output image is opened as a new copy of an existing image,e.g. 
the output of the imcopy task.imdir specifies the default image pixel directory for"imh"formatfiles.The image header files are written to the current directory and the pixelfiles are written to imdir.imdir can be set to an existing directory on a scratch disk,the current directory"HDR$",or the subdirectory pix-els under the current directory"HDR$pixels/".DAOPHOT users should keep both the intrinsic speed of a disk and its network configuration in mind when setting imdir.min_lenuserarea is the size of the image header area reserved in memory when a new or existing image is opened.The current default value of24000corresponds to space for approxi-mately300keywords.If an image on disk has a header larger than this the image header will be truncated when it is read.For most DAOPHOT users the default value is sufficient.How-ever users whose images have large headers or who are creating a point-spread function using more than~70stars should set min_lenuserarea to a larger value,e.g.40000.The following example shows how to change the default pixel directory to HDR$pixels/ and set min_lenuserarea to40000.To avoid redefining these quantities for every session,users should enter the redefinitions into their login.cl or loginuser.clfiles.cl>reset imdir="HDR$pixels/"cl>reset min_lenuserarea=400004.4.The Image Display and Image CursorSeveral DAOPHOT tasks are interactive tasks or have an interactive as well as a non-interactive mode.In interactive mode these tasks must be able to read the image cursor on a displayed image and perform various actions depending on the position of the image cursor and the keystroke command typed.DAOPHOT will work with the display servers Imtool,Saoimage,and Ximtool.DAO-PHOT users should be aware that both Imtool and Ximtool support multiple frame buffers while SAOimage does not.Multiple frame buffers are an important feature for users who wish to compare their original images with the DAOPHOT output images from which all thefitted stars have been ers running DAOPHOT on a remote machine,e.g.one with lots of memory and/or disk space,but displaying on their local machine also need to set the node environment variable to the name of the local machine.cl>show nodeERROR:No such environment variableshow(node)cl>set node=mymachineThe maximum size of the display server frame buffer is defined by the environment vari-able stdimage whose value can be printed as shown below.cl>show stdimageimt512In the previous example the default frames buffers are512pixels square.A user whose images are2K square will want to reset the default frame buffer size as shown below.cl>reset stdimage=imt2048cl>show stdimageimt2048In order for image cursor read-back to function correctly the environment variable stdim-cur must be set to"stdimage"as shown below.cl>show stdimcurstdimageTo check that image cursor read-back is functioning correctly the user should display an image and try to bring up the image display cursor as shown below.cl>display image1cl>=imcurThe image cursor should appear on the image display reading the correct image pixel coordi-nates and ready to accept a keystroke command.Any keystroke will terminate the cursor read.4.5.The Graphics Device and Graphics CursorSome interactive DAOPHOT tasks have graphics submenus which require them to be able to read the graphics cursor on for example a radial profile plot and perform various actions based on the position of the graphics cursor in the plot and the keystroke command issued.The default graphics device is determined by the stdgraph 
environment variable as shown below. cl>show stdgraphxgtermTo check that graphics cursor read-back is functioning correctly the user should draw a plot and try to bring up the graphics cursor as shown below.cl>contour imagecl>=gcurThe graphics cursor should appear in the graphics window ready to accept a keystroke com-mand.Any keystroke will terminate the cursor read.5.Some DAOPHOT Basics for New DAOPHOT Users5.1.Loading the DAOPHOT PackageThe DAOPHOT package is located in the digital stellar photometry package DIGIPHOT. To load DIGIPHOT and DAOPHOT the user types the package names in sequence as shown below,cl>digiphotdi>daophotafter which the following menu of tasks appears.addstar daotest nstar pexamine psfallstar datapars@pcalc pfmerge psortcenterpars@findpars@pconcat phot pstselect daoedit fitskypars@pconvert photpars@seepsfdaofind group pdump prenumber setimpars daopars@grpselect peak pselect substarTask names with a trailing"@"are parameter set tasks.The remaining tasks are script and/or compiled tasks.After the DAOPHOT package is loaded the user can redisplay the package menu at any time with the command.da>?daophot5.2.Loading the TABLES PackageThe DAOPHOT photometry tasks write their output photometryfiles in either text format (the default)or ST binary tables ers wishing to use the ST binary tables format should acquire and install the ST TABLES external package.Without the TABLES package the DAOPHOT photometry tasks will read and write ST binary tables,but DAOPHOT utilities like psort which call TABLES package tasks will not run on ST binary tables.When DAOPHOT is loaded,it checks to see if the TABLES package is defined,and if so loads it.A warning message is issued if the TABLES package is undefined.The TABLES pack-age tasks can be listed at any time after DAOPHOT is loaded with the following command. da>?tables5.3.Running the Test ScriptThe DAOPHOT package includes a script task daotest which executes each of the core DAOPHOT photometry tasks in turn using a test image stored in FITS format in the DAO-PHOT test directory.Daotest is run as shown below.da>daotestDAOTEST INITIALIZES THE DAOPHOT TASK PARAMETERSTYPE’q’or’Q’TO QUIT,ANY OTHER KEY TO PROCEEDName of the output test image:test。
ACE Exam Guide Photoshop.pdf
Adobe®Photoshop®CS EXAM PREPARATION GUIDEEXAM PREPARATION GUIDEAdobe®Photoshop CSThis guide provides all the information you need to get started in preparing for Adobe Certified Expert (ACE) exams. Passing ACE exams demonstrates your proficiency with Adobe software products and allows you to promote your expertise with customers and employers. As an ACE, you can stand out from your competitors and be noticed. ACE Frequently Asked QuestionsWhat is an ACE? What are the benefits of being certified? What are the exams like? Get all your ACE questions answered in a brief, to-the-point FAQ.Four–Step Check ListLearn about the process for getting certified.Step 1: Choose your certification level.Step 2: Register for your exam(s).Step 3: Prepare for and take your exam(s).Step 4: Read the ACE Agreement and join the Adobe Certified community. Exam Topic Areas and ObjectivesReview the test content—topic areas and objectives—for each product exam.The topic lists will help direct your studies, so please review correctly. Practice ExamsReview sample questions to get an idea of the types of questions that willbe on the exam. Note: Though helpful, these practice exams are not necessarily representative of the difficulty of the questions that you will encounter on theactual exam.Practice exams are available for all of the following Adobe products:• Adobe® Acrobat® Professional• Adobe After Effects®• Adobe GoLive®• Adobe Illustrator® CS• Adobe InDesign® CS• Adobe Photoshop® CS• Adobe Premiere® ProQ. What is an Adobe Certified Expert (ACE)?A. An Adobe Certified Expert (ACE) has earned a certification created specifically for graphic designers, Web designers, video professionals, system integrators, value-added resellers, developers, or business professionals seeking recognition for their expertise with Adobe products. By passing one or more ACE exams, you become eligible to promote yourself to prospective clients as a highly skilled, expert-level user of Adobe software.Q. What is the main benefit of passing ACE exams and becoming Adobe certified?A. Adobe certification is an industry standard in excellence that can be used to demonstrate your product knowledge and expertise, and also serves as a catalyst when finding a job or seeking a promotion. The Adobe Certified Expert (ACE) and Adobe Certified Instructor (ACI) logos and credentials have been created for self-promotional use by individuals who have met all requirements of the Adobe certified Expert or Adobe Certified Instructor programs. Those who have passed the tests may place certification logos on business cards, resumes, Web sites, and other promotional materials.Q. What are ACE Specialist and ACE Master?A. There are three levels of Adobe certification. 1. Single product certification: Recognizes your proficiency in a single Adobe product. To qualify as an ACE, you must pass one product-specific exam. Example: ACE, Adobe InDesign® CS 2. Specialist certification: Recognizes your proficiency in a specific medium: Print, Web or Video. To become certified as a Specialist, you must pass the exams on the required products listed below. Example: ACE Print Specialist (with passing marks on the tests for Adobe InDesign, Adobe Acrobat®, and either Adobe Photoshop® or Adobe Illustrator®)Specialists must pass all required (R) exams and any one elective (E) within anygiven certification track Note: Exam requirements are subject to change.FREQUENTLY ASKED QUESTIONSC E R T I F I EDE X P E R T3. 
Master certification: Recognizes your skills in terms of how they align with the Adobe product suites. To become certified as a Master, you must pass the exam for each of the products in the suite. Example: ACE, Creative Suite Master (with passing marks on the tests for Adobe Acrobat, Adobe GoLive, Adobe Illustrator, Adobe InDesign, and Adobe Photoshop)Q. What is an Adobe Certified Instructor (ACI)?A. An Adobe Certified Instructor (ACI) is an ACE who provides instruction on Adobe products. ACIs must have an instructor qualification (a teaching credential, passed the CompTIA CTT+ (/certification/ctt/) or equivalent), in addition to passing one or more ACE exams. For more information visit the ACI home page, /support/certification/aci.html Q. How do I prepare for an ACE Exam?A. The keys to preparing for an ACE exam are experience with the product and studying this Exam Preparation Guide (/asn/programs/trainingprovider/aceexams/index.jsp). To access online training, user guides, and many other study materials, visit Adobe’s training resources page (/misc/training.html). The following resources may also help you prepare for your ACE exam:• Adobe product user guides • Adobe Press books • Adobe Authorized Training Centers (find the AATC nearest you through the Adobe Partner Finder online at: /asn/partnerfinder/search_training.jsp) • Adobe online training at • Tutorials and materials from Total Training, at • Other training resources online at /misc/training.html Q. How do I get training?A. Once you know what product you want to get certified in, use the Adobe Partner Finder (/asn/partnerfinder/trainingprovider/index.jsp) database to locate an Adobe Certified Instructor or an Adobe Authorized Training Center in your area that teaches a class on the Adobe product you want to learn. Contact them directly for more information about their classes and registration details.Q. What are the benefits to becoming an ACE?A. As an individual with an ACE, you can:• Differentiate yourself from competitors • Get your resumé noticed • Attract and win new business • Gain recognition from your employer • Get the inside track on Adobe’s latest offerings • Leverage the power of the Adobe brand and resources If you are an employer, use ACE as a benchmark so you can:• Find the right person for the job • Quickly assess candidate skill level • Invest in your most promising employees • Increase productivity and efficiencyFREQUENTLY ASKED QUESTIONSQ. What type of exam are the ACE exams?A. Adobe ACE exams (which are administered by Pearson VUE and Thomson Prometric, independent third-party testing companies) are computer-delivered, closed-book tests consisting of 60 to 90 multiple-choice questions. Each exam takes approximately one to two hours to complete, and results are given to you at the testing center immediately after you complete the test. Q. How do I register for an ACE exam?A.Contact Pearson VUE or Thomson Prometric by phone, on the Web, or in-person:Q. What is the fee for the ACE exam?A . Each exam is US$150 or local currency equivalent.Q. Soon after I passed an ACE exam, the exam was published for a new product version. Do I have to successfully complete the new exam within 90 days?A. No. You may continue to use the ACE logo and materials in accordance with the Adobe Certified Program Guidelines for Logos and Credentials. 
When the exam is published for the next version of the product you must successfully complete that exam within 90 days to keep your ACE status current and continue using the ACE logo and materials. For example, you successfully completed the Adobe Photoshop 5.0 Product Proficiency Exam; shortly thereafter, the Adobe Photoshop 6.0 Product Proficiency Exam is published. If you choose not take the Adobe Photoshop 6.0 Exam, you may continue to use the ACE logo and materials. When the Photoshop 7.0 Product Proficiency Exam is published you must successfully complete that exam with 90 days in order to continue to use the ACE logo and materials. Otherwise, you would have to cease using the ACE logo and materials; however, you could continue to use text references only to your Adobe Photoshop 5.0 certification.Q. What can I expect from Adobe if I pass the ACE exam?A. As soon as you pass the exam, your name and exam results are given to us by Pearson VUE or Thomson Prometric, our worldwide test administrators. Your exam data is then entered into our database. You will then be sent an ACE Welcome Kit and access to the ACE logo. You are also placed on our certification mailing list to receive special Adobe announcements and information about promotions and events that take place throughout the year.Q. How long does it take to receive my Welcome Kit?A. You can expect your Welcome Kit to arrive four to six weeks after we receive your exam results.FREQUENTLY ASKED QUESTIONSSTEP 1: Choose your certification level There are three levels of certification to become an Adobe® Certified Expert. Choose the one that’s right for you.1. Single product certification: Recognizes your proficiency in a single Adobe product. To qualify as an ACE, you must pass one product-specific exam. Example: ACE, Adobe InDesign® CS 2. Specialist certification: Recognizes your proficiency in a specific medium: print, Web, or video. To become certified as a Specialist, you must pass the exams on the required products listed below. Example: ACE Print Specialist (with passing marks on the tests for Adobe InDesign, Adobe Acrobat®, and either Adobe Photoshop® or Adobe Illustrator®)3. Master certification: Recognizes your skills in terms of how they align with the Adobe product suites. To become certified as a Master, you must pass the exam for each of the products in the suite. Example: ACE, Creative Suite Master (with passing marks on the tests for Adobe Acrobat, Adobe GoLive®, Adobe Illustrator, Adobe InDesign, and Adobe Photoshop)Specialists must pass all required (R) exams and any one elective (E) within anygiven certification track Note: Exam requirements are subject to change.FOUR-STEP CHECK LISTC E R T I F I EDE X P E R TSTEP 2: Register for your exam(s)Adobe ACE exams are administered by Pearson VUE and Thomson Prometric, independent third-party testing companies. The tests are offered at more than five thousand authorized testing centers in many countries.To register for your ACE exam(s), contact Pearson VUE or Thomson Prometric byphone, on the Web, or in person:The ACE exam fee is US$150 worldwide.STEP 3: Prepare for and take your exam(s The keys to preparing for an ACE exam are experience with the product and studying using the Exam Bulletin. 
To access online training, user guides, and many other study materials, visit Adobe’s training resources page.The following resources may also help you prepare for your ACE exam:• Adobe product user guides • Adobe Press books • Adobe Authorized Training Centers (find the AATC nearest you through the online Adobe Partner Finder)• Adobe online training from Element K ()• Tutorials and materials from Total Training ()• Other online training resources ACE exams are computer-delivered, closed-book tests consisting of 60 to 90 multiple-choice questions. Each exam takes one to two hours to complete, and results are given to you at the testing center immediately after you finish.STEP 4: Sign the ACE Agreement and join the Adobe Certified community Review the ACE Agreement prior to taking the exam. You will be asked to agree to the ACE terms and conditions at the time of the exam.Once you are ACE certified, visit the Adobe Certified community where you can:• Verify exam results • Get a copy of the ACE Agreement • Download your Adobe Certified logo • Find out about recertification • Connect with other Adobe certificants • Update your profile N ote: Information about the Adobe Certified Community is also printed on your score report .Recertify when necessary Once obtained, your ACE certification is valid until 90 days after the exam version of your certification is retired. Adobe will e-mail you a reminder when your certification is due for renewal and will let you know when you need to take another ACE exam.FOUR-STEPCHECK LISTEXAM TOPIC AREAS AND OBJECTIVES Adobe®Photoshop CSTest Content: Topic Areas and ObjectivesFollowing is a detailed outline of the information covered on the exam.1. Using the work area• Configure, save, and load workspaces.• Create a layer comp by using Layer Comps palette.• Manage libraries by using the Preset Manager.• Describe the functionality provided from the status bar and the tool options bar.2. Importing, exporting, and saving• Import and manipulate files by using the Place command.• Import RAW files from a digital camera.• Given a file format, explain when you would use that format in Photoshop.(file formats include: PSD, PDF, TIFF, EPS, JPEG, GIF, PNG)• Explain when to rasterize and when not to rasterize vector data.• Describe the functionality provided by the File Browser.3. Working with selections• Given a scenario, choose the correct tool to create a selection.• Modify selections by using a tool or command.• Explain how feathering affects selections.• Explain how to work with Alpha channels to save or modify a selection.• Explain how to create and modify a temporary mask by using the Quick Mask command.4. Creating and using layers• Use the appropriate tools and commands to create and manage layers.• Edit layers by using the editing, vector, and painting tools.• Use the appropriate tools and commands to create and modify clipping masks, vector masks, layer masks, and layer sets.• Apply blending modes to layers and layer sets.• Create layer styles.• Given a mask, describe the functionality and when you would use the mask.(masks include vector and raster)5. Using channels• Given a channel, describe the channel, and explain when you would use it.(channels include: Alpha, color, and spot)• Explain how to create, manage, and use channels.6. Managing color • Discuss the color management workflow process that is used in Adobe Photoshop. 
Ca2+ mediates a negative feedback
This negative feedback:
• Speeds up the recovery of the rod, improving the temporal resolution
• Extends the dynamic range of the rod to higher light intensities
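These two claims can be made concrete with a toy simulation. The sketch below is not the lecture's model: it is a minimal two-variable ODE system (cGMP and Ca2+, with illustrative, normalized rate constants) in which Ca2+ inhibits guanylate cyclase; running it with the feedback switched on and off shows the faster recovery that the feedback provides.

```python
# Minimal sketch (not the lecture's model): a toy ODE system illustrating how
# Ca2+ feedback onto guanylate cyclase speeds recovery of the rod photocurrent.
# All rate constants and Hill coefficients are illustrative, normalized choices.
import math

BETA_DARK = 1.0        # basal cGMP hydrolysis rate, 1/s
ALPHA_MAX = 2.0        # maximal cyclase rate; chosen so the dark steady state is cG = Ca = 1
K_GC, M_GC = 1.0, 2.0  # Ca2+ dependence of the cyclase (Hill form)
N_CH = 3.0             # cooperativity of cGMP gating of the channel
K_CA = 5.0             # Ca2+ turnover rate, 1/s
E0, TAU_E = 5.0, 0.5   # flash-driven PDE* activity: amplitude (1/s) and decay time (s)
DT, T_END = 1e-3, 5.0

def cyclase(ca):
    """Guanylate cyclase rate, inhibited by Ca2+ (GCAP-mediated feedback)."""
    return ALPHA_MAX / (1.0 + (ca / K_GC) ** M_GC)

def simulate(ca_feedback=True):
    """Integrate the toy model after a brief flash at t = 0; return (times, response)."""
    cg, ca = 1.0, 1.0            # dark steady state (normalized)
    alpha_fixed = cyclase(1.0)   # cyclase clamped at its dark value if feedback is off
    times, resp = [], []
    t = 0.0
    while t < T_END:
        beta = BETA_DARK + E0 * math.exp(-t / TAU_E)  # light-activated PDE decays after the flash
        alpha = cyclase(ca) if ca_feedback else alpha_fixed
        current = cg ** N_CH                          # fraction of dark current still flowing
        dcg = alpha - beta * cg
        dca = K_CA * (current - ca)                   # influx tracks current, extrusion is first order
        cg += DT * dcg
        ca += DT * dca
        times.append(t)
        resp.append(1.0 - current)                    # fraction of current suppressed by light
        t += DT
    return times, resp

def recovery_time(times, resp, level=0.5):
    """Time after the peak at which the response has fallen back to `level` of its peak."""
    peak = max(resp)
    i_peak = resp.index(peak)
    for i in range(i_peak, len(resp)):
        if resp[i] < level * peak:
            return times[i]
    return float("nan")

if __name__ == "__main__":
    for fb in (True, False):
        t, r = simulate(ca_feedback=fb)
        print(f"Ca2+ feedback {'on ' if fb else 'off'}: 50% recovery at ~{recovery_time(t, r):.2f} s")
```

With the feedback enabled the simulated response recovers noticeably faster (and its peak is slightly smaller, consistent with the extended dynamic range); the exact figures depend entirely on the placeholder constants.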
Inactivation: transducin α* inactivation
GTP hydrolysis through intrinsic GTPase activity inactivates α*
Inactivation: acceleration of transducin α* inactivation
Amplification: one R* activates tens of transducins
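As a back-of-the-envelope illustration of what this gain stage implies for the messenger (the numbers below are order-of-magnitude placeholders, not figures from the lecture), one can multiply the gains of the successive steps:

$$N_{\mathrm{cGMP\ hydrolyzed}} \;\sim\; \underbrace{N_{T^{*}}}_{\sim 20\ \text{per }R^{*}} \times \underbrace{k_{\mathrm{cat}}^{\mathrm{PDE}}}_{\sim 10^{3}\ \mathrm{s^{-1}}} \times \underbrace{\tau_{\mathrm{eff}}}_{\sim 0.1\ \mathrm{s}} \;\sim\; 2\times10^{3}$$

so a single photoisomerization can remove thousands of cGMP molecules, assuming each activated transducin switches on one PDE catalytic subunit for an effective time τ_eff.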
Activation:
transducin α* binds 1:1 to the inhibitory γ subunit of the PDE
binding of transducin α* to PDE-γ relieves the inhibition from the PDE catalytic subunits, and so cGMP hydrolysis increases
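A compact way to summarize the effect of this step on the second messenger (a standard textbook balance equation, not taken from these slides) is:

$$\frac{d[\mathrm{cGMP}]}{dt} \;=\; \alpha \;-\; \bigl(\beta_{\mathrm{dark}} + \beta^{*}(t)\bigr)\,[\mathrm{cGMP}]$$

where α is the synthesis rate by guanylate cyclase, β_dark the basal hydrolysis rate, and β*(t) the extra hydrolysis contributed by the PDE catalytic subunits freed from PDE-γ inhibition; light raises β*(t), lowers [cGMP], and so closes the cGMP-gated channels.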
Overview: Light activation
Overview: Inactivation
RK: rhodopsin kinase
Arr: arrestin
Ri: inactivated rhodopsin (phosphorylated)
Overview: Inactivation
Rod outer segment structure
• Overview
• Activation
• Inactivation
• Ca2+ Feedback
Rod outer segment structure
Dowling (1987)
Overview: In the dark
Rh: rhodopsin
T: transducin
GC: guanylate cyclase
Monkey rod and cone
Schnapf and Baylor (Scientific American, 1987)
Phototransduction
• Phototransduction is the conversion of incoming light into an electrical signal by the photoreceptor cells
GPCRs: targets of >50% of therapeutic agents
Marinissen & Gutkind (TIPS, 2001)
Vertebrate Eye
retina
Human retina
adapted from Dowling (1987)
Two photoreceptor cell types: rod and cone
• Rhodopsin: a seven-transmembrane-helix receptor, the prototypical GPCR and the first with a solved crystal structure. Composed of a chromophore, 11-cis retinal, bound to an apoprotein, opsin. Absorption of a photon isomerizes the chromophore to all-trans, generating an activated state. Cone visual pigments contain different kinds of opsin and so have different spectral sensitivities.
Ca2+ modulates several of the phototransduction components
• Guanylate Cyclase: Ca2+ inhibits the stimulating effect of the Guanylate Cyclase Activating Protein (GCAP)
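The Ca2+ dependence of the cyclase is commonly written as a Hill-type inhibition (a generic form; the constants quoted are illustrative orders of magnitude, not values from this lecture):

$$\alpha\bigl([\mathrm{Ca}^{2+}]\bigr) \;=\; \frac{\alpha_{\max}}{1 + \bigl([\mathrm{Ca}^{2+}]/K_{1/2}\bigr)^{m}}$$

so the fall in Ca2+ during the light response pushes α toward α_max and speeds cGMP resynthesis; reported half-inhibition constants are on the order of a few hundred nanomolar with m around 2.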
Outer segment
The visual pigment rhodopsin
a membrane protein
contains 11-cis retinal, derived from vitamin A
Rhodopsin
Seven transmembrane helices; retinal is bound to Lys296
(supports the adaptation of the rod to light)
Phototransduction Components
Crystal structure solved
Palczewski et al (Science, 2000)
Rhodopsin activation
Rhodopsin is the primary model for G-Protein-Coupled Receptors (GPCRs)
Phototransduction Topics
Inactivation: R* inactivation
first step: phosphorylation by rhodopsin kinase
Inactivation: R* inactivation
second step: binding of arrestin to phosphorylated rhodopsin
PDE: cGMP-phosphodiesterase: α and β catalytic, γ inhibitory subunits
Overview: Light activation
R*: light-activated rhodopsin; α*: activated transducin α-subunit (with bound GTP)
RGS9: speeds up GTP hydrolysis by transducin α*
Regulator of G-protein Signaling (RGS); GTPase-Activating Protein (GAP)
Note: RGS9 activity requires two other proteins, Gβ5L and R9AP (an anchoring protein)
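In rate terms (a schematic summary, not from the slides), the lifetime of α*·GTP is set by the sum of the intrinsic and GAP-assisted hydrolysis rates:

$$\tau_{\alpha^{*}} \;=\; \frac{1}{k_{\mathrm{intrinsic}} + k_{\mathrm{GAP}}}$$

with k_GAP (the rate contributed by the RGS9·Gβ5L complex anchored by R9AP) much larger than k_intrinsic, so in the intact rod τ_α* is set by the GAP complex and response shutoff is correspondingly fast.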
Ca2+ feedback: The inhibitory effects of Ca2+, before its concentration has decreased
Ca2+ feedback: The decrease in Ca2+ antagonizes the effect of light
Phototransduction as a model for signal transduction
Yiannis Koutalos
Departments of Ophthalmology and Neurosciences
Medical University of South Carolina
Rod
for night vision
Cone
for day vision, color
Outer segment structure
Rod
Outer segment
photoreceptor
Electron micrographs
Kroll & Machemer (1968)
Recordings from single cells
Light flashes transiently suppress the current flowing into the outer segment
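A common way to relate the suppressed current to the underlying messenger (a generic relation with an illustrative Hill coefficient, not taken from the slides) uses the cooperative cGMP gating of the outer segment channels:

$$\frac{J}{J_{\mathrm{dark}}} \;\approx\; \left(\frac{[\mathrm{cGMP}]}{[\mathrm{cGMP}]_{\mathrm{dark}}}\right)^{n}, \qquad n \approx 2\text{–}3$$

so even a modest fractional drop in cGMP produces a proportionally larger suppression of the circulating dark current.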
Cone photoreceptors
Cones are faster and less sensitive than rods – they operate in bright light
Kroll & Machemer (1968)
Activation:
light has activated a rhodopsin molecule
Activation:
R* diffuses on the disk surface, binds to inactive transducin
Activation:
Light-sensitive current
The effect of light
Membrane current of single rod outer segments
Suction pipette recording: Baylor, Lamb and Yau (J. Physiol., 1979)
R* catalyzes the exchange of GDP for GTP on transducin α
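Schematically (a standard G-protein activation cycle, written here as a summary rather than reproduced from the slides):

$$R^{*} + T_{\alpha\beta\gamma}\!\cdot\!\mathrm{GDP} \;\longrightarrow\; R^{*}\!\cdot\!T_{\alpha\beta\gamma} \;\xrightarrow{\;\mathrm{GDP}\,\to\,\mathrm{GTP}\;}\; R^{*} + T_{\alpha}\!\cdot\!\mathrm{GTP} + T_{\beta\gamma}$$

R* emerges unchanged and can catalyze further exchanges, which is the molecular basis of the amplification noted earlier.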
Activation