Thursday, 13 November 2008
Clustering: Self-Organizing Map (SOM)
A self-organizing map serves two particularly useful purposes: visualization and cluster analysis. Visualizing high-dimensional data has typically been difficult. SOMs can be used to explore the groupings and relations within such data by projecting the data onto a two-dimensional image that clearly indicates regions of similarity. Even if visualization is not the goal of applying a SOM to a dataset, the clustering ability of the SOM is very useful.
Tuesday, 5 August 2008
MPEG 7 Standard
The elements that MPEG-7 standardizes provide support to a broad range of applications (for example, multimedia digital libraries, broadcast media selection, multimedia editing, home entertainment devices, etc.). MPEG-7 will also make the web as searchable for multimedia content as it is searchable for text today. This would apply especially to large content archives, which are being made accessible to the public, as well as to multimedia catalogues enabling people to identify content for purchase.
OBJECTIVES OF MPEG-7 STANDARD
The MPEG-7 standard aims to provide standardized core technologies that allow the description of audiovisual data content in multimedia environments. Audiovisual data content that has MPEG-7 data associated with it may include still pictures, graphics, 3D models, audio, speech, video, and composition information about how these elements are combined in a multimedia presentation (scenarios). Special cases of these general data types may include facial expressions and personal characteristics.
Thursday, 17 July 2008
Geometric meaning of SVD
From Wikipedia
Because U and V are unitary, we know that the columns u_1, ..., u_m of U yield an orthonormal basis of K^m and the columns v_1, ..., v_n of V yield an orthonormal basis of K^n (with respect to the standard scalar products on these spaces).
The linear transformation T : K^n → K^m that takes a vector x to Mx has a particularly simple description with respect to these orthonormal bases: we have T(v_i) = σ_i u_i for i = 1, ..., min(m,n), where σ_i is the i-th diagonal entry of Σ, and T(v_i) = 0 for i > min(m,n).
The geometric content of the SVD theorem can thus be summarized as follows: for every linear map T : K^n → K^m one can find orthonormal bases of K^n and K^m such that T maps the i-th basis vector of K^n to a non-negative multiple of the i-th basis vector of K^m, and sends the left-over basis vectors to zero. With respect to these bases, the map T is therefore represented by a diagonal matrix with non-negative real diagonal entries.
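As a small numerical check of the relation T(v_i) = σ_i u_i, the sketch below uses a hypothetical 2 × 2 real matrix M and a hand-picked orthonormal basis v1, v2. Both are assumptions chosen for illustration and are not part of the text above. For this particular M the images of v1 and v2 come out orthogonal, with lengths of roughly 2.83 and 1.41, which are the singular values.

```python
import math

# Hypothetical example matrix (an assumption, not from the text).
M = [[2.0, 2.0],
     [-1.0, 1.0]]

def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

# Orthonormal basis of the domain, chosen by hand for this M.
s = 1.0 / math.sqrt(2.0)
v1 = [s, s]
v2 = [s, -s]

# T(v_i) = M v_i; its length is sigma_i and its direction is u_i.
images = [matvec(M, v) for v in (v1, v2)]
sigmas = [norm(w) for w in images]
us = [[c / sig for c in w] for w, sig in zip(images, sigmas)]

print("singular values:", sigmas)
print("u1, u2:", us)
print("u1 . u2 =", dot(us[0], us[1]))  # orthogonal, so this is 0
```

The vectors u1 and u2 recovered this way form an orthonormal basis of the target space, so with respect to (v1, v2) and (u1, u2) the map is diagonal with entries sigma_1 and sigma_2.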
To get a more visual flavour of singular values and the SVD, at least when working on real vector spaces, consider the sphere S of radius one in R^n. The linear map T maps this sphere onto an ellipsoid in R^m. The non-zero singular values are simply the lengths of the semi-axes of this ellipsoid. In particular, when n = m and all the singular values are distinct and non-zero, the SVD of the linear map T can be analysed as a succession of three consecutive moves. Consider the ellipsoid T(S) and specifically its axes; then consider the directions in R^n sent by T onto these axes. These directions happen to be mutually orthogonal. First, apply an isometry v* sending these directions to the coordinate axes of R^n. Second, apply an endomorphism d that is diagonal along the coordinate axes and stretches or shrinks in each direction, using the semi-axis lengths of T(S) as stretching coefficients. The composition d ∘ v* then sends the unit sphere onto an ellipsoid isometric to T(S). Third, apply an isometry u to this ellipsoid so as to carry it onto T(S). As can be easily checked, the composition u ∘ d ∘ v* coincides with T.
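The three moves can be traced on a concrete case. The sketch below assumes a hypothetical 2 × 2 matrix M together with hand-computed factors V_star, D and U. All four are illustrative assumptions, not taken from the text. It applies the isometry, the axis-aligned stretch and the final isometry in turn, then compares the result with applying M directly.

```python
import math

s = 1.0 / math.sqrt(2.0)

# Hypothetical example matrix and its hand-computed factors (assumptions).
M = [[2.0, 2.0],
     [-1.0, 1.0]]
V_star = [[s, s],
          [s, -s]]                      # first move: isometry v*
D = [[2.0 * math.sqrt(2.0), 0.0],
     [0.0, math.sqrt(2.0)]]             # second move: stretch d along the axes
U = [[1.0, 0.0],
     [0.0, -1.0]]                       # third move: isometry u

def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def three_moves(x):
    """Apply v*, then d, then u, i.e. the composition u o d o v*."""
    return matvec(U, matvec(D, matvec(V_star, x)))

x = [0.6, 0.8]                          # a point on the unit circle
direct = matvec(M, x)
composed = three_moves(x)
print(direct, composed)                 # the two agree up to rounding
```

The diagonal entries of D are the semi-axis lengths of the ellipse that M produces from the unit circle.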
Endomorphism
In mathematics, an endomorphism is a morphism (or homomorphism) from a mathematical object to itself. For example, an endomorphism of a vector space V is a linear map ƒ: V → V and an endomorphism of a group G is a group homomorphism ƒ: G → G, etc. In general, we can talk about endomorphisms in any category. In the category of sets, endomorphisms are simply functions from a set S into itself.
In any category, the composition of any two endomorphisms of X is again an endomorphism of X. It follows that the set of all endomorphisms of X forms a monoid, denoted End(X) (or EndC(X) to emphasize the category C).
An invertible endomorphism of X is called an automorphism. The set of all automorphisms is a subgroup of End(X), called the automorphism group of X and denoted Aut(X). In the following diagram, the arrows denote implication:
| automorphism | ⇒ | isomorphism |
| ⇓ | | ⇓ |
| endomorphism | ⇒ | (homo)morphism |
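To make the monoid End(X) and the group Aut(X) concrete, here is a minimal sketch in the category of sets, assuming a hypothetical three-element set S. Each function from S to itself is stored as a tuple of images. The names End, Aut and compose are illustrative.

```python
from itertools import product

S = (0, 1, 2)  # hypothetical three-element set

# An endomorphism f is stored as a tuple where f[x] is the image of x.
End = set(product(S, repeat=len(S)))

def compose(f, g):
    """Return the map x -> f(g(x))."""
    return tuple(f[g[x]] for x in S)

identity = tuple(S)

# Monoid laws: closure under composition and a two-sided identity.
closed = all(compose(f, g) in End for f in End for g in End)
has_identity = all(
    compose(f, identity) == f and compose(identity, f) == f for f in End
)

# Automorphisms are the invertible (here: bijective) endomorphisms.
Aut = {f for f in End if len(set(f)) == len(S)}
has_inverses = all(
    any(compose(f, g) == identity for g in Aut) for f in Aut
)

print(len(End), len(Aut), closed, has_identity, has_inverses)
```

For this S there are 27 endomorphisms, of which the 6 permutations are the automorphisms.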
Any two endomorphisms of an abelian group A can be added together by the rule (ƒ + g)(a) = ƒ(a) + g(a). Under this addition, the endomorphisms of an abelian group form a ring (the endomorphism ring). For example, the set of endomorphisms of Z^n is the ring of all n × n matrices with integer entries. The endomorphisms of a vector space, module, ring, or algebra also form a ring, as do the endomorphisms of any object in a preadditive category. The endomorphisms of a nonabelian group generate an algebraic structure known as a near-ring.
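The identification of endomorphisms of Z^n with integer matrices can be checked on a small case. The sketch below assumes n = 2 and two hypothetical matrices F and G. It checks that pointwise addition of the maps agrees with matrix addition and that composition agrees with matrix multiplication. All names are illustrative.

```python
def apply(A, v):
    """Apply the integer matrix A (tuple of rows) to the vector v."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

def mat_add(A, B):
    return tuple(tuple(a + b for a, b in zip(ra, rb)) for ra, rb in zip(A, B))

def mat_mul(A, B):
    cols = list(zip(*B))
    return tuple(
        tuple(sum(a * b for a, b in zip(row, col)) for col in cols)
        for row in A
    )

def vec_add(u, v):
    return tuple(a + b for a, b in zip(u, v))

# Two hypothetical endomorphisms of Z^2 and a test element.
F = ((1, 2), (0, 1))
G = ((0, -1), (3, 4))
a = (5, -2)

# (F + G)(a) = F(a) + G(a): ring addition is pointwise addition.
sum_pointwise = vec_add(apply(F, a), apply(G, a))
sum_matrix = apply(mat_add(F, G), a)

# (F o G)(a) = F(G(a)): ring multiplication is composition.
comp_pointwise = apply(F, apply(G, a))
comp_matrix = apply(mat_mul(F, G), a)

print(sum_pointwise, sum_matrix)    # both (3, 5)
print(comp_pointwise, comp_matrix)  # both (16, 7)
```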
Operator theory
In any concrete category, especially for vector spaces, endomorphisms are maps from a set into itself. They may be interpreted as unary operators on that set, acting on its elements and allowing one to define notions such as the orbit of an element.
Depending on the additional structure defined for the category at hand (topology, metric, ...), such operators can have properties like continuity, boundedness, and so on. More details can be found in the article about operator theory.
Isometry
From Wikipedia
In mathematics, an isometry, isometric isomorphism or congruence mapping is a distance-preserving isomorphism between metric spaces. Geometric figures which can be related by an isometry are called congruent.
Isometries are often used in constructions where one space is embedded in another space. For instance, the completion of a metric space M involves an isometry from M into M', a quotient set of the space of Cauchy sequences on M. The original space M is thus isometrically isomorphic to a subspace of a complete metric space, and it is usually identified with this subspace. Other embedding constructions show that every metric space is isometrically isomorphic to a closed subset of some normed vector space and that every complete metric space is isometrically isomorphic to a closed subset of some Banach space.
Definitions
The notion of isometry comes in two main flavors: global isometry and a weaker notion, path isometry or arcwise isometry. Both are often called just isometry, and one should determine from context which one is intended.
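Let X and Y be metric spaces with metrics d_X and d_Y. A map ƒ : X → Y is called distance preserving if for any x, y ∈ X one has
d_Y(ƒ(x), ƒ(y)) = d_X(x, y).
A distance preserving map is automatically injective, because ƒ(x) = ƒ(y) implies d_X(x, y) = d_Y(ƒ(x), ƒ(y)) = 0 and hence x = y.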
A global isometry is a bijective distance preserving map. A path isometry or arcwise isometry is a map which preserves the lengths of curves (not necessarily bijective).
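The distance-preserving condition can be tested on a finite sample of points. The sketch below assumes the Euclidean plane for both X and Y and two hypothetical maps: a rotation followed by a shift, which preserves distances, and a projection onto the x-axis, which does not. A check on a finite sample can only refute the property, not prove it in general.

```python
import math
from itertools import combinations

def euclid(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def is_distance_preserving(f, points, d_X, d_Y):
    """Check d_Y(f(x), f(y)) == d_X(x, y) for every pair in the sample."""
    return all(
        math.isclose(d_Y(f(x), f(y)), d_X(x, y), abs_tol=1e-12)
        for x, y in combinations(points, 2)
    )

def rotate_and_shift(p):
    """Quarter-turn about the origin followed by a translation."""
    x, y = p
    return (-y + 3, x - 1)

def project(p):
    """Projection onto the x-axis; collapses vertical distances."""
    x, _ = p
    return (x, 0)

sample = [(0, 0), (1, 0), (0, 2), (3, 4), (-2, 1)]

print(is_distance_preserving(rotate_and_shift, sample, euclid, euclid))  # True
print(is_distance_preserving(project, sample, euclid, euclid))           # False
```

The projection also fails to be injective on this sample, since (0, 0) and (0, 2) have the same image.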
Two metric spaces X and Y are called isometric if there is an isometry from X to Y. The set of isometries from a metric space to itself forms a group with respect to function composition, called the isometry group.
Tuesday, 15 July 2008
GOP (Group Of Pictures)
In MPEG encoding, a group of pictures, or GOP, specifies the order in which intra-frames and inter-frames are arranged.
The GOP is a group of successive pictures within an MPEG-coded video stream. Each MPEG-coded video stream consists of successive GOPs. The visible frames are generated from the MPEG pictures contained in a GOP.
A GOP can contain the following picture types:
I-picture or I-frame (intra coded picture): a reference picture that corresponds to a fixed image and is independent of other picture types. Each GOP begins with this type of picture.
P-picture or P-frame (predictive coded picture): contains motion-compensated difference information from the preceding I- or P-frame.
B-picture or B-frame (bidirectionally predictive coded picture): contains difference information from the preceding and following I- or P-frame within a GOP.
D-picture or D-frame (DC direct coded picture): used for fast-forward playback.
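The dependencies described in the list above can be sketched as a small lookup. The function below is a hypothetical helper, not part of any MPEG library. It takes a GOP written as a string of I, P and B in display order and returns, for each frame index, the indices of the anchor frames it depends on. It is a simplification: a B-frame with no following anchor inside the GOP gets only its preceding anchor, and D-frames are not handled.

```python
def references(pattern):
    """Map each frame index to the anchor frames it is predicted from.

    pattern: GOP in display order, e.g. "IBBP" (illustrative format).
    """
    anchors = [i for i, t in enumerate(pattern) if t in "IP"]
    refs = {}
    for i, t in enumerate(pattern):
        prev_anchor = max((a for a in anchors if a < i), default=None)
        next_anchor = min((a for a in anchors if a > i), default=None)
        if t == "I":
            candidates = []                          # independent picture
        elif t == "P":
            candidates = [prev_anchor]               # preceding I or P
        else:
            candidates = [prev_anchor, next_anchor]  # both neighbours
        refs[i] = [a for a in candidates if a is not None]
    return refs

print(references("IBBP"))
# {0: [], 1: [0, 3], 2: [0, 3], 3: [0]}
```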
A GOP always begins with an I-frame. Several P-frames follow, each a few frames apart, and B-frames fill the remaining gaps. A new GOP begins with the next I-frame.
The GOP structure is often referred to by two numbers, for example M=3, N=12. The first is the distance between two anchor frames (I or P). The second is the distance between two full images (I-frames), that is, the GOP length. For this example, the GOP structure is IBBPBBPBBPBB. Instead of the M parameter, one can give the maximum number of B-frames between two consecutive anchor frames.
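The M and N parameters are enough to write out the frame pattern. The sketch below uses a hypothetical helper, gop_pattern, that places an I-frame first, an anchor P-frame every M frames, and B-frames everywhere else, in display order. It assumes the regular structure described above and ignores encoder-specific variations.

```python
def gop_pattern(m, n):
    """Return the display-order frame types for a GOP.

    m: distance between anchor frames (I or P)
    n: GOP length, i.e. distance between I-frames
    """
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive")
    frames = []
    for i in range(n):
        if i == 0:
            frames.append("I")   # every GOP starts with an I-frame
        elif i % m == 0:
            frames.append("P")   # anchor frame every m positions
        else:
            frames.append("B")   # gaps between anchors
    return "".join(frames)

print(gop_pattern(3, 12))  # IBBPBBPBBPBB
```

With this layout the maximum number of B-frames between two consecutive anchors is M - 1, which is the alternative parameter mentioned above.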
The more I-frames the MPEG stream has, the more editable it is. However, having more I-frames increases the stream size. In order to save bandwidth and disk space, videos prepared for internet broadcast often have only one I-frame per GOP.
I-frames contain the full image and do not require any additional information to reconstruct it. Therefore any error in the stream is corrected by the next I-frame (an error in an I-frame propagates until the next I-frame). Errors in P-frames propagate until the next anchor frame (I or P). B-frames do not propagate errors.
Monday, 14 July 2008
How to Write a Video Player in Less Than 1000 Lines
http://www.dranger.com/ffmpeg/



