Content related to "feature-extraction"

(1) feature-extraction D5.4 Report on evaluation of methods
This document reports on the first evaluation of tools developed in the LIVE project for manual, semiautomatic and automatic annotation and extraction of knowledge in work package 5. We start this report with findings on the international TRECVID 2007 evaluation of LIVE tools for automatic shot boundary classification. The compressed domain shot boundary detector developed in the LIVE project showed the third best recognition performance of all 15 participating research groups in this competition. Despite the excellent results, generalizing this performance from the news and documentary data used in TRECVID 2007 to the more difficult sports data produced by the LIVE streams of Olympia 2008 remains difficult. Only further evaluations on labelled data stemming from Olympia 2004 and the upcoming Olympia 2008 event will show how suitable the developed technology is for extracting information automatically from sports broadcasts, a domain for which neither standard international benchmarks nor any international competition exist. The detection of gradual transitions in sports video must still be considered unsolved and needs further research. However, the TRECVID evaluation results show the potential of the developed technology and its maturity. The next section of this document deals with the performance of different face recognition methods which are developed in the LIVE project to identify athletes and other important persons in the video stream automatically. We measure the performance in rather controlled optimal situations, benchmarked on the Bochum gallery, but also on a “worst-case” gallery with rather mixed content. The results are promising, but uncontrolled environments and incorrect feature correspondence lead to poor results, especially if more advanced P2D-HMM face recognition technology is applied. 
Hence, component face detectors have been developed in the project in order to improve the correspondence search in pose estimation before any identification can be performed. We report in this document on the performance of several face component detectors for eyes, nose and mouth locations developed in the course of the project to improve face pose estimation and recognition. Despite the fact that the performance of individual face component detectors is quite high when evaluated on a test set stemming from the same database, generalization of the facial recognition algorithms to other more uncontrolled galleries remains a challenge. However, as the integration of the face component detectors in the face recognition framework is still lacking, no sound evaluation can be performed. We will report in an upcoming report D 5.7 on the results of our research and how the different algorithms perform on Olympia 2008 sports data during the field trial.
(1) feature-extraction A Block-Edge-Pattern-Based Content Descriptor in DCT Domain
In this correspondence, we describe a robust and effective content descriptor based on block-edge patterns extracted in the discrete cosine transform domain, which is suitable for applications in JPEG or MPEG compressed images and videos. This content descriptor is constructed by a run-length edge-block histogram with three patterns including horizontal edge, vertical edge and no edge. In comparison with existing descriptors, the proposed descriptor: 1) has a low computing cost suitable for real-time implementation and high-speed processing of compressed videos; 2) is robust to orientation changes such as rotation, noise, reverse, etc.; 3) operates in the compressed domain. Extensive experiments support that the proposed content descriptor is effective in describing visual content, and achieves superior performances in terms of retrieval precision and recall rates.
(1) feature-extraction A fuzzy logic approach for detection of video shot boundaries
Video temporal segmentation is normally the first and important step for content-based video applications. Many features including the pixel difference, colour histogram, motion, and edge information etc. have been widely used and reported in the literature to detect shot cuts inside videos. Although existing research on shot cut detection is active and extensive, it still remains a challenge to achieve accurate detection of all types of shot boundaries with one single algorithm. In this paper, we propose a fuzzy logic approach to integrate hybrid features for detecting shot boundaries inside general videos. The fuzzy logic approach contains two processing modes, where one is dedicated to detection of abrupt shot cuts including those short dissolved shots, and the other for detection of gradual shot cuts. These two modes are unified by a mode-selector to decide which mode the scheme should work on in order to achieve the best possible detection performances. By using the publicly available test data set from Carleton University, extensive experiments were carried out and the test results illustrate that the proposed algorithm outperforms the representative existing algorithms in terms of the precision and recall rates. © 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
(1) feature-extraction Video Indexing and Retrieval in Compressed Domain Using Fuzzy-Categorization
There has been an increased interest in video indexing and retrieval in recent years. In this work, the indexing and retrieval system for visual content is based on features extracted from the compressed domain. Direct processing of the compressed domain spares the decoding time, which is extremely important when indexing a large number of multimedia archives. A fuzzy-categorizing structure is designed in this paper to improve the retrieval performance. In our experiment, a database that consists of basketball videos has been constructed for our study. This database includes three categories: full-court match, penalty and close-up. First, spatial and temporal feature extraction is applied to train the fuzzy membership functions using the minimum entropy optimal algorithm. Then, the max composition operation is used to generate a new fuzzy feature to represent the content of the shots. Finally, the fuzzy-based representation becomes the indexing feature for the content-based video retrieval system. The experimental results show that the proposed algorithm is quite promising for semantic-based video retrieval.
(1) feature-extraction Extracting Objects and Events from MPEG Sequences for Video Highlights Indexing and Retrieval
Automatic recognition of highlights from videos is a fundamental and challenging problem for content-based indexing and retrieval applications. In this paper, we propose techniques to solve this problem using knowledge-supported extraction of semantics, and employing compressed-domain processing for efficiency. Firstly, knowledge-supported rules are utilized for shot detection on the extracted DC-images, and statistical skin detection is applied for human object detection. Secondly, through filtering outliers in motion vectors, improved detection of camera motions like zooming, panning and tilting is achieved. High-level semantics like video highlights are then automatically extracted via low-level analysis in the detection of human objects and camera motion events, and finally these highlights are taken for shot-level annotation, indexing and retrieval. Results from a large set of test videos have demonstrated the accuracy and robustness of the proposed techniques.
(1) feature-extraction Shot Boundary Detection in MPEG Videos using Local and Global Indicators
Shot boundary detection (SBD) plays an important role in many video applications. In this paper, we describe a novel method for SBD operating directly in the compressed domain. Firstly, several local indicators are extracted from MPEG macroblocks, and AdaBoost is employed for feature selection and fusion. The selected features are then used in classifying candidate cuts into five sub-spaces via pre-filtering and rule-based decision making. Following that, global indicators of frame similarity between boundary frames of cut candidates are examined using phase correlation of DC-images. Gradual transitions like fade, dissolve and combined shot cuts are also identified. Experimental results on the test data from TRECVID’07 have demonstrated the effectiveness and robustness of our proposed methodology.
(1) feature-extraction Hierarchical Modeling and Adaptive Clustering for Real-time Summarization of Rush Videos
In this paper, we provide a detailed description of a proposed new algorithm for video summarization, which is also included in our submission to TRECVID’08 on BBC rush summarization. Firstly, rush videos are hierarchically modeled using the formal language technique. Secondly, shot detection is applied to introduce a new concept of V-unit for structuring videos in line with the hierarchical model, and thus junk frames within the model are effectively removed. Thirdly, adaptive clustering is employed to group shots into clusters to determine retakes for redundancy removal. Finally, the most representative shot selected from every cluster is ranked according to its length and sum of activity level for summarization. Competitive results have been achieved to prove the effectiveness and efficiency of our techniques, which are fully implemented in the compressed domain. Our work does not require high-level semantics such as object detection and speech/audio analysis, which provides a more flexible and general solution for this topic.
(1) feature-extraction Subspace Extension to Phase Correlation Approach for Fast Image Registration
A novel extension of phase correlation to subspace correlation is proposed, in which 2-D translation is decomposed into two 1-D motions thus only 1-D Fourier transform is used to estimate the corresponding motion. In each subspace, the first two highest peaks from 1-D correlation are linearly interpolated for subpixel accuracy. Experimental results have shown both the robustness and accuracy of our method.
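The decomposition described above can be illustrated with a minimal sketch. It assumes that the two 1-D signals are the row and column sums of each image (the abstract does not say how the subspaces are formed), that the shift is circular, and that only integer shifts are wanted, so the subpixel interpolation step is left out. All function names are hypothetical.

```python
import cmath
import random


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform, enough for short 1-D profiles."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        result.append(total / n if inverse else total)
    return result


def shift_1d(reference, moved):
    """Integer circular shift d such that moved[i] == reference[i - d]."""
    cross = []
    for f, g in zip(dft(reference), dft(moved)):
        c = g * f.conjugate()
        # Keep only the phase; drop bins that carry no energy.
        cross.append(c / abs(c) if abs(c) > 1e-9 else 0j)
    correlation = [v.real for v in dft(cross, inverse=True)]
    n = len(correlation)
    peak = max(range(n), key=correlation.__getitem__)
    return peak if peak <= n // 2 else peak - n  # wrap to a signed shift


def estimate_translation(reference, moved):
    """Return (dx, dy) from two 1-D correlations of the image projections."""
    def column_sums(img):
        return [sum(col) for col in zip(*img)]  # profile along x

    def row_sums(img):
        return [sum(row) for row in img]        # profile along y

    return (shift_1d(column_sums(reference), column_sums(moved)),
            shift_1d(row_sums(reference), row_sums(moved)))


def circular_shift(img, dx, dy):
    """Shift an image (list of rows) by dx columns and dy rows, wrapping around."""
    h, w = len(img), len(img[0])
    return [[img[(r - dy) % h][(c - dx) % w] for c in range(w)]
            for r in range(h)]


rng = random.Random(0)
image = [[rng.randrange(256) for _ in range(12)] for _ in range(8)]
print(estimate_translation(image, circular_shift(image, dx=3, dy=-2)))
```

Each 1-D correlation costs far less than a full 2-D transform, which is the efficiency argument the abstract makes; real image pairs are not circularly shifted, so windowing and the interpolation of the two highest peaks would matter in practice.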
(1) feature-extraction Statistical Classification of Skin Color Pixels from MPEG Videos
Detection and classification of skin regions play important roles in many image processing and vision applications. In this paper, we present a statistical approach for fast skin detection in MPEG-compressed videos. Firstly, conditional probabilities of skin and non-skin are extracted from manually marked training images. Then, candidate skin pixels are identified using the Bayesian maximum a posteriori decision rule. An optimal threshold is then obtained by analysis of probability error on the basis of the likelihood ratio histogram of skin and non-skin pixels. Experiments on sequences with varying illumination have demonstrated the effectiveness of our approach.
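The decision rule outlined above can be sketched as follows. This is only an illustration under stated assumptions: colours are quantized (Cb, Cr) pairs, the class-conditional probabilities are plain histograms with add-one smoothing, and the threshold is the plain MAP value rather than the tuned optimum the abstract refers to. The bin width and all names are hypothetical.

```python
from collections import Counter

STEP = 8                       # bin width per chroma axis (assumed)
NUM_BINS = (256 // STEP) ** 2  # number of (Cb, Cr) histogram bins


def bin_of(cb, cr):
    """Map a chroma pair to its histogram bin."""
    return (cb // STEP, cr // STEP)


def train(samples):
    """samples: iterable of ((cb, cr), is_skin) pairs from hand-labelled images."""
    hist = {True: Counter(), False: Counter()}
    for (cb, cr), label in samples:
        hist[label][bin_of(cb, cr)] += 1
    return hist


def likelihood_ratio(hist, cb, cr):
    """P(colour | skin) / P(colour | non-skin), with add-one smoothing."""
    key = bin_of(cb, cr)

    def conditional(label):
        total = sum(hist[label].values())
        return (hist[label][key] + 1) / (total + NUM_BINS)

    return conditional(True) / conditional(False)


def is_skin(hist, cb, cr):
    """MAP rule: skin when the likelihood ratio exceeds P(non-skin) / P(skin).

    Assumes the training data contains at least one skin sample.
    """
    n_skin = sum(hist[True].values())
    n_non = sum(hist[False].values())
    return likelihood_ratio(hist, cb, cr) > n_non / n_skin


# Toy training set: values chosen for illustration only.
samples = ([((105, 150), True)] * 50
           + [((128, 128), False)] * 100
           + [((105, 150), False)] * 5)
model = train(samples)
print(is_skin(model, 106, 151), is_skin(model, 130, 130))
```

Replacing the fixed MAP threshold with one chosen from the likelihood-ratio histograms, as the abstract describes, would only change the right-hand side of the comparison in `is_skin`.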
(1) feature-extraction COMPRESSED-DOMAIN SHOT BOUNDARY DETECTION USING FINITE STATE MACHINE AND CONTENT-BASED RULES
We propose a fast and systematic method for shot boundary detection in the compressed domain using content-based rules and an FSM (finite state machine). Firstly, several feature indicators are acquired from DC images in MPEG videos including luminance, color, edge, prediction error and inter-frame difference as well as motion. Then, several content-based rules are utilized to detect abrupt cuts. Thirdly, boundaries of gradual transitions are determined by a coarse-to-fine procedure with a pre-processing module and an FSM. Experiments using publicly available sequences from TRECVID show that the proposed algorithm outperforms the representative existing algorithms in both precision and recall rates.
(1) feature-extraction Recognition of JPEG Compressed Face Images Based on AdaBoost
This paper presents an advanced face recognition system based on the AdaBoost algorithm in the JPEG compressed domain. First, the dimensionality is reduced by truncating some of the block-based DCT coefficients and the nonuniform illumination variations are alleviated by discarding the DC coefficient of each block. Next, an improved AdaBoost.M2 algorithm, which uses the Euclidean distance (ED) to eliminate non-effective weak classifiers, is proposed to select the most discriminative DCT features from the truncated DCT coefficient vectors. Finally, LDA is used as the final classifier. Experiments on Yale face databases show that the proposed approach is superior to other methods in terms of recognition accuracy, efficiency, and illumination robustness.
(1) feature-extraction Description of Online and Offline Metadata Extraction out of Sports Videos
We focus on online and offline metadata extraction and annotation out of sports videos. The main benefit of our method is immediate and automatic extraction and annotation of metadata by giving semantics to combinations of heterogeneous low-level visual features. It brings new opportunities for efficient utilisation of sports video in improved ways, and is easily customized to address the characteristics. Firstly, semantic scene classification is described, including key-frames extraction, similarities determination between shots, and rule based estimation of scene boundaries. Secondly, fuzzy logic based categorizing is presented, including paradigm, Fuzzy membership function, and fuzzy feature generation and similarity measure. Thirdly, automatic sports video annotation is proposed, including robust dominant colour region detection, combined motion feature analysis. This work has been evaluated in the TRECVID 2007 competition.
(1) feature-extraction A New Robust Watermarking Scheme for Color Image in Spatial Domain
This paper presents a new robust watermarking scheme for color image based on a block probability in spatial domain. A binary watermark image is permutated using sequence numbers generated by a secret key and Gray code, and then embedded four times in different positions by a secret key. Each bit of the binary encoded watermark is embedded by modifying the intensities of a non-overlapping block of 8*8 of the blue component of the host image. The extraction of the watermark is by comparing the intensities of a block of 8*8 of the watermarked and the original images and calculating the probability of detecting '0' or '1'. Tested by benchmark Stirmark 4.0, the experimental results show that the proposed scheme is robust and secure against a wide range of image processing operations.
(1) feature-extraction Camera Motion Analysis towards Semantic-based Video Retrieval in Compressed Domain
To reduce the semantic gap between low-level visual features and the richness of human semantics, this paper proposes new algorithms, by virtue of the combined camera motion descriptors with multi-threshold, to automatically retrieve the semantic concepts, i.e., close-up and panorama, directly in the MPEG compressed domain based on camera motion analysis. Extensive experiments illustrate that the proposed algorithms provide promising retrieval results under a real-time application scenario and without human intervention.
(1) feature-extraction Face Detection based on Skin Color in Image by Neural Networks
Face detection is one of the challenging problems in image processing. A novel face detection system is presented in this paper. The approach relies on skin-based color features extracted from the two-dimensional Discrete Cosine Transform (DCT) and neural networks, which can be used to detect faces by using skin color from the DCT coefficients of Cb and Cr feature vectors. The system first uses skin color, which is the main feature of faces for detection, and then the skin face candidate is examined by using the neural networks, which learn from the features of faces to classify whether the original image includes a face or not. The processing is based on normalization and the Discrete Cosine Transform. Finally, the classification is based on the neural network approach. The experimental results on upright frontal color face images from the internet show an excellent detection rate.
(1) feature-extraction Real-time and Automatic Close-up Retrieval from Compressed Videos
In this paper, we propose a thorough scheme, by virtue of camera zooming descriptor with two-level threshold, to automatically retrieve close-ups directly from MPEG compressed videos based on camera motion analysis. In the retrieval process, we build camera-motion-based semantic retrieval. To improve the coverage of the proposed scheme, we investigate close-up retrieval in all kinds of videos. Extensive experiments illustrate that the proposed scheme provides promising retrieval results under real-time and automatic application scenario.
(1) feature-extraction Robustness Analysis on Facial Image Description in DCT Domain
In this letter, we report a DCT domain analysis of facial images to reveal that, when a certain number of DCT coefficients are removed, the corresponding facial image description by the remaining DCT coefficients becomes robust to lighting changes and scale variations. Such properties would be very useful for applications of face recognition, video object tracking, object segmentation and visual content processing.
(1) feature-extraction A Block-Edge-Pattern based Content Descriptor in DCT Domain
In this correspondence, we describe a robust and effective content descriptor based on block edge patterns extracted directly in the DCT domain, which is suitable for applications in JPEG or MPEG compressed images and videos. This content descriptor is constructed by a run-length edge-block histogram with three patterns including horizontal edge, vertical edge and no-edge. In comparison with existing descriptors, the proposed descriptor: (i) has a low computing cost suitable for real-time implementation and high-speed processing of compressed images or videos; (ii) is robust to orientation changes such as rotation, noise, reverse etc.; (iii) operates directly in the compressed domain. Extensive experiments support that the proposed content descriptor is effective in describing visual content. In comparison with existing techniques, the proposed descriptor achieves superior performances in terms of retrieval precision and recall rates.
(1) feature-extraction Face Detection based Neural Networks using Robust Skin Color Segmentation
This paper proposes a robust schema for a face detection system that uses a Gaussian mixture model to segment images based on skin color. After the selection of skin and non-skin face candidates, features are extracted directly from discrete cosine transform (DCT) coefficients computed from these candidates. Moreover, back-propagation neural networks are used to train and classify faces based on DCT feature coefficients in Cb and Cr color spaces. This schema utilizes the skin color information, which is the main feature of face detection. DCT feature values of faces, representing the data set of skin/non-skin face candidates obtained from the Gaussian mixture model, are fed into the back-propagation neural networks to classify whether the original image includes a face or not. Experimental results show that the proposed schema is reliable for face detection, and pattern features are detected and classified accurately by the back-propagation neural networks.
(1) feature-extraction Extracting Semantics and Content Adaptive Summarisation for Effective Video Retrieval
In this paper, we provide a system for semantic video retrieval in which extracted semantic contents are used to generate summarised videos for effective delivery of retrieved results. Firstly, several useful features are extracted in compressed video on the basis of the DC-images and motion vectors. Secondly, shot changes are detected to enable shot-level content indexing and retrieval. Thirdly, several semantic concepts are automatically detected including outdoor/indoor scenes, building, sky and human objects. The results of detected shots and extracted semantic concepts are then used for semantic indexing of video contents. Furthermore, a combined measurement is produced from these semantics for content adaptive video summarisation. According to the network performance, the retrieved video can be delivered at various sizes using our summarisation techniques for efficiency.
(1) feature-extraction MPEG-2 Compressed-Domain Algorithms for Video Analysis
This paper presents new algorithms for extracting metadata from video sequences in the MPEG-2 compressed domain. Three algorithms for efficient low-level metadata extraction in preprocessing stages are described. The first algorithm detects camera motion using the motion vector field of an MPEG-2 video. The second method extends the idea of motion detection to a limited region of interest, yielding an efficient algorithm to track objects inside video sequences. The third algorithm performs a cut detection using macroblock types and motion vectors.
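To make the first algorithm more concrete, here is a minimal sketch of how a pan and zoom estimate could be fitted to a motion vector field. It is not the paper's method: it assumes a simple model in which each vector equals a global translation plus a zoom factor times the macroblock's offset from the field centre, solved by least squares, with no outlier filtering. The input format and all names are hypothetical, and the sign convention of real MPEG-2 motion vectors depends on the decoder.

```python
def estimate_pan_zoom(field):
    """Fit v = pan + zoom * (p - centre) to a motion vector field.

    field: list of ((x, y), (vx, vy)) pairs, one per macroblock.
    Returns (pan_x, pan_y, zoom).
    """
    n = len(field)
    # Centre of the macroblock positions; offsets from it sum to zero,
    # which decouples the translation and zoom estimates.
    cx = sum(p[0] for p, _ in field) / n
    cy = sum(p[1] for p, _ in field) / n
    pan_x = sum(v[0] for _, v in field) / n
    pan_y = sum(v[1] for _, v in field) / n
    numerator = sum(v[0] * (p[0] - cx) + v[1] * (p[1] - cy) for p, v in field)
    denominator = sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p, _ in field)
    zoom = numerator / denominator if denominator else 0.0
    return pan_x, pan_y, zoom


# Synthetic 4x4 field: pan of (2, -1) plus a zoom of 0.1 around the centre.
synthetic = [((x, y), (2 + 0.1 * (x - 1.5), -1 + 0.1 * (y - 1.5)))
             for x in range(4) for y in range(4)]
print(estimate_pan_zoom(synthetic))
```

Restricting the same fit to the macroblocks inside a region of interest gives a rough picture of how the idea could extend to the object tracking mentioned in the abstract.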
(1) feature-extraction Analysis of cluttered scenes using an elastic matching approach for stereo images.
We present a system for the automatic interpretation of cluttered scenes containing multiple partly occluded objects in front of unknown, complex backgrounds. The system is based on an extended elastic graph matching algorithm that allows the explicit modeling of partial occlusions. Our approach extends an earlier system in two ways. First, we use elastic graph matching in stereo image pairs to increase matching robustness and disambiguate occlusion relations. Second, we use richer feature descriptions in the object models by integrating shape and texture with color features. We demonstrate that the combination of both extensions substantially increases recognition performance. The system learns about new objects in a simple one-shot learning approach. Despite the lack of statistical information in the object models and the lack of an explicit background model, our system performs surprisingly well for this very difficult task. Our results underscore the advantages of view-based feature constellation representations for difficult object recognition problems.
(1) feature-extraction Fusion of intensity and channel difference for improved colour edge detection
Edge detection, especially from colour images, plays very important roles in many applications for image analysis, segmentation and recognition. In this paper, a new colour-gray mapping method for effective colour edge detection is proposed. From any given colour image C, a gray image D is defined as the accumulative differences between each of its two colour channels, and another gray image R is then obtained by weighting of D and gray intensity image G. Fusion of edges extracted from R and G forms the final results. Compared with edges detected from traditional colour spaces like RGB, YCbCr and HSV, all using the same Canny operator, it seems the proposed method can achieve more effective results from different test images.
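The mapping from C to D and R can be sketched per pixel as below. The abstract does not give the exact formulas, so this assumes D is the sum of absolute differences over the three channel pairs, G is the channel mean, and R is a convex combination of D and G with a hypothetical weight of 0.5. Edge extraction itself (for example with a Canny operator) is not shown.

```python
def channel_difference(pixel):
    """D: accumulated absolute differences between each pair of channels."""
    r, g, b = pixel
    return abs(r - g) + abs(g - b) + abs(b - r)


def intensity(pixel):
    """G: gray intensity, taken here as the plain channel mean (assumed)."""
    return sum(pixel) / 3


def fused_gray(pixel, weight=0.5):
    """R: weighted combination of D and G; the weight is an assumed value."""
    return weight * channel_difference(pixel) + (1 - weight) * intensity(pixel)


def map_image(image, weight=0.5):
    """Apply the colour-to-gray mapping to an image given as rows of RGB tuples."""
    return [[fused_gray(px, weight) for px in row] for row in image]


# Two regions with identical intensity but different colour.
gray_patch, orange_patch = (90, 90, 90), (180, 90, 0)
row = [gray_patch, gray_patch, orange_patch, orange_patch]
print([intensity(px) for px in row])
print(map_image([row])[0])
```

The two patches have the same intensity, so an edge detector run on G alone would miss the boundary between them, while the mapped image R separates them clearly. This is the kind of case that motivates fusing edges from both R and G.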
(1) feature-extraction DCT-Domain Image Retrieval Via Block-Edge-Patterns
A new algorithm for compressed image retrieval is proposed in this paper based on DCT block edge patterns. This algorithm directly extracts three edge patterns from compressed image data to construct an edge pattern histogram as an indexing key to retrieve images based on their content features. Three feature-based indexing keys are described, which include: (i) the first two features are represented by 3-D and 4-D histograms respectively; and (ii) the third feature is constructed by following the spirit of run-length coding, which is performed on consecutive horizontal and vertical edges. To test and evaluate the proposed algorithms, we carried out two-stage experiments. The results show that our proposed methods are robust to color changes and varied noise. In comparison with existing representative techniques, the proposed algorithms achieve superior performances in terms of retrieval precision and processing speed.
(1) feature-extraction An Effective and Fast Scene Change Detection Algorithm for MPEG Compressed Videos
In this paper, we propose an effective and fast scene change detection algorithm operating directly in the MPEG compressed domain. The proposed scene change detection exploits the MPEG motion estimation and compensation scheme by examining the prediction status for each macro-block inside B frames and P frames. As a result, locating both abrupt and dissolved scene changes is performed by a sequence of comparison tests, and no feature extraction or histogram differentiation is needed. Therefore, the proposed algorithm can operate in the compressed domain, and is suitable for real-time implementations. Extensive experiments illustrate that the proposed algorithm achieves up to 94% precision for abrupt scene change detection and 100% for gradual scene change detection. In comparison with similar existing techniques, the proposed algorithm achieves superiority measured by recall and precision rates.
(1) feature-extraction An efficient face image retrieval through DCT features
This paper proposes a new, simple method of DCT feature extraction that accelerates the image retrieval process and reduces the storage it needs by accessing and extracting content directly from the JPEG compressed domain. Our method extracts the average of some DCT block coefficients and needs only a vector of six coefficients per block over all image blocks. In our retrieval system, for simplicity, both query and database images are normalized and resized from the original database based on the centred position of the eyes, and each normalized image is divided equally into non-overlapping 8×8 pixel blocks, each of which is associated with a feature vector derived directly from the discrete cosine transform (DCT). Users can select any query as the main theme of the query image. Retrieval is based on the relevance between a query image and each database image, ranked by similarity computed with the Euclidean distance. The experimental results show that our approach easily identifies main objects and reduces the influence of background in the image, and thus improves the performance of image retrieval.
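The block-feature and ranking steps can be illustrated with the sketch below. It departs from the paper in several labelled ways: the DCT is computed from pixel values rather than read from a JPEG stream, the six coefficients per block are assumed to be the first six in zigzag order, and no averaging or eye-based normalization is done. All names are hypothetical.

```python
import math

N = 8  # block size
# First six coefficients in zigzag order, as (row, column) frequencies (assumed choice).
FIRST_SIX = [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]


def dct_coefficient(block, u, v):
    """Orthonormal 2-D DCT-II coefficient (u, v) of an 8x8 block."""
    cu = math.sqrt((1 if u == 0 else 2) / N)
    cv = math.sqrt((1 if v == 0 else 2) / N)
    total = sum(block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                for x in range(N) for y in range(N))
    return cu * cv * total


def image_features(image):
    """Concatenate six low-frequency coefficients from every 8x8 block."""
    features = []
    for top in range(0, len(image) - N + 1, N):
        for left in range(0, len(image[0]) - N + 1, N):
            block = [r[left:left + N] for r in image[top:top + N]]
            features.extend(dct_coefficient(block, u, v) for u, v in FIRST_SIX)
    return features


def rank(query, database):
    """Return database names sorted from most to least similar to the query."""
    q = image_features(query)
    return sorted(database,
                  key=lambda name: math.dist(q, image_features(database[name])))


def flat(value):
    """A uniform 8x16 test image (two blocks), for illustration only."""
    return [[value] * 16 for _ in range(8)]


print(rank(flat(12), {"dark": flat(10), "bright": flat(200)}))
```

In a real system the database features would be computed once and stored, so that a query needs only one feature extraction and a set of distance computations.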
(1) feature-extraction Subsampling-based image watermarking in compressed DCT domain
In this paper, a new embedding strategy for watermarking is presented based on DC components of subimages in compressed discrete cosine transform (DCT) domain. These subimages are obtained through subsampling the host image. More robustness has been achieved when watermarks are embedded in perceptually significant DC components. Furthermore, the original image is not required in the extraction process. Experimental results show that the proposed scheme successfully makes the watermark perceptually invisible and robust for a wide range of attacks, including JPEG-loss compression, filtering, scaling, and cropping attacks.
(1) feature-extraction Skin Detection from Different Color Spaces for Model-based Face Detection
Skin and face detection has many important applications in intelligent human-machine interfaces, reliable video surveillance and visual understanding of human activities. In this paper, we propose an efficient and effective method for frontal-view face detection based on skin detection and knowledge-based modeling. Firstly, skin pixels are modeled by using supervised training, and boundary conditions are then extracted for skin segmentation. Faces are further detected by shape filtering and knowledge-based modeling. Skin results from different color spaces are compared. In addition, experimental results have demonstrated that our method is robust in detecting skin and face regions even with variant lighting conditions and poses.
(1) feature-extraction University of Bradford at TRECVID 2008 Content Based Copy Detection Task
We present a novel method for spatial-temporal video copy detection based on adaptive masking. Firstly, a dedicated video analysis is implemented for input videos, which ensures the accurate detection of complicated distortions query videos may undergo. Secondly, simple signatures are extracted for the benefit of time and space efficiency, and the frame mask is generated adaptively to reduce video temporal redundancy. Thirdly, a matching process is implemented to find video copies. The proposed video copy detection framework is effective, and robust against spatial and temporal variations.
(1) feature-extraction Progressive content access to databases of JPEG compressed images
Progressive content access provides a mode that allows a coarse version of an image to be viewed at a lower computing cost and then gradually refined by subsequent resolution enhancement if required. This proves extremely useful when millions of compressed images and video sequences need to be browsed manually or processed in the pixel domain, saving the cost and removing the necessity of full decompression. In this paper, we propose such a progressive content access algorithm suitable for all DCT-based JPEG and MPEG compressed files. We first develop a theoretical model that approximates the cosine function used in the IDCT with various orders. Following that, we propose a progressive content access algorithm, which comprehends both the successive approximation and the spectral selection. Further analysis and experiments are reported to show that our proposed algorithm saves computational cost in comparison with JPEG full decompression. Extensive experiments also support that the proposed algorithm achieves encouraging PSNR values for reconstructed images even with lower order approximation.