In order to make predictions from our experiences of the world around us we often compare situations by counting off units of measurement. If this doesn't work exactly, we count fractional units and use limits thereof to identify the quantities of interest with real numbers (or with sets of real numbers when several related measurements are needed to describe the situation). The sets of numbers are just tools for describing things, and there are many different ways of assigning them (e.g. using different units, or measuring different aspects of the situation), but oddly it is physicists rather than mathematicians who seem most obsessed with the numbers themselves rather than the actual quantities of physical interest.
In many situations there is a class of quantities which can be combined with one another by a process analogous to addition of numbers. One example is the combining of two massive objects into a single composite which has a mass equal to the sum of the masses of its components. If we allow both addition and subtraction of parts, then for any two masses there is a real multiple of one which exactly balances the other. Another situation is the combination of geometric displacements by applying one after the other to get a combined displacement (which we also call the sum). In this case, however, the balancing of any pair does not always work out. We can find many pairs of displacements such that no combination of nonzero multiples of them is the same as doing nothing (indeed this will always be the case unless they are parallel). But if we are given more than three such displacements then we can always find one of them which is balanced by a sum of multiples of the others – and in particular if we pick three that are not coplanar then any fourth will be a sum of multiples of them. (This fact is what we mean when we say that the space of displacements is three dimensional.) We refer to the three given displacements in terms of which others are expressed as forming a basis and for any other vector the numerical multipliers are called its coordinates with respect to that basis. Since the use of longer basis vectors would require smaller multiples we sometimes say that these coordinates are contravariant (since they vary in an opposite way to the basis vectors).
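As a small illustration of expressing a fourth displacement in terms of three non-coplanar ones, here is a minimal sketch in Python. The function names det3 and coordinates, and the sample vectors, are made up for this example; the multipliers are found by Cramer's rule, which is only one of several ways to solve the three linear equations involved.

```python
from fractions import Fraction

def det3(a, b, c):
    """Scalar triple product a . (b x c), i.e. the determinant with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def coordinates(v, b1, b2, b3):
    """Multipliers (x1, x2, x3) with v = x1*b1 + x2*b2 + x3*b3 (Cramer's rule)."""
    d = det3(b1, b2, b3)
    if d == 0:
        # Coplanar displacements cannot serve as a basis.
        raise ValueError("basis displacements are coplanar")
    return (Fraction(det3(v, b2, b3), d),
            Fraction(det3(b1, v, b3), d),
            Fraction(det3(b1, b2, v), d))

# Hypothetical sample data: three non-coplanar displacements and a fourth one.
b1, b2, b3 = (1, 0, 0), (1, 1, 0), (1, 1, 1)
v = (3, 5, 2)
x = coordinates(v, b1, b2, b3)  # equals (-2, 3, 2): v = -2*b1 + 3*b2 + 2*b3
```

Exact fractions are used so that the multipliers can be checked by hand without rounding noise.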
In mathematics any set of objects with operations of “addition” and “multiplication” by scalars which satisfy certain basic properties is called a vector space, and the maximum number of vectors in a linearly independent set is called its dimension. The set of all possible displacements in physical space is a vector space of dimension 3, as are the set of all possible forces acting at a point and the set of all ordered triples of real numbers. Given any basis for a 3-dimensional vector space we can define a 1:1 correspondence between vectors in the space and triples of real numbers by just taking the coefficient triples needed to express the vectors in terms of the basis vectors. As before, since the use of longer basis vectors would require smaller multiples, we sometimes refer to these as contravariant coordinates. Another way of associating a triple of real numbers with each vector would be to just take its geometric dot products with the three basis vectors. For the case of an orthonormal basis (consisting of mutually perpendicular vectors of unit length) this would give the same coordinates as before, but if the basis vectors are not orthogonal or not unit vectors it gives something different. Since in this case taking longer basis vectors gives bigger numbers, the resulting numbers are called covariant coordinates.
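The opposite scaling behaviour of the two kinds of coordinates can be seen in a short Python sketch. To keep it brief it assumes the basis vectors are mutually perpendicular (though not necessarily of unit length), in which case the expansion coefficient along each basis vector is just the dot product divided by that basis vector's squared length. The function names and sample vectors are invented for the illustration.

```python
from fractions import Fraction

def dot(u, w):
    """Ordinary geometric dot product of two coordinate triples."""
    return sum(x * y for x, y in zip(u, w))

def covariant(v, basis):
    """Covariant coordinates: dot products of v with each basis vector."""
    return [dot(v, b) for b in basis]

def contravariant_orthogonal(v, basis):
    """Expansion coefficients of v in the basis.

    Assumption: the basis vectors are mutually perpendicular, so each
    coefficient is (v . b) / (b . b). A general basis needs a linear solve.
    """
    return [Fraction(dot(v, b), dot(b, b)) for b in basis]

v = (3, 4, 5)
unit = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]        # orthonormal basis
stretched = [(2, 0, 0), (0, 1, 0), (0, 0, 1)]   # first basis vector doubled

cov_unit = covariant(v, unit)                            # [3, 4, 5]
con_unit = contravariant_orthogonal(v, unit)             # [3, 4, 5]
cov_stretched = covariant(v, stretched)                  # first entry doubles to 6
con_stretched = contravariant_orthogonal(v, stretched)   # first entry halves to 3/2
```

For the orthonormal basis the two lists agree; doubling the first basis vector doubles the first covariant coordinate and halves the first contravariant one.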
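Optional Aside: The relationship between covariant and contravariant coordinates goes beyond just scaling. If we change the choice of basis vectors, say from $b_1, b_2, b_3$ to $b_1' = a_{11}b_1 + a_{12}b_2 + a_{13}b_3$ and so on (that is, $b_i' = \sum_j a_{ij} b_j$), then for a vector $v = v_1 b_1 + v_2 b_2 + v_3 b_3$ the dot products with the primed basis are given by $v \cdot b_1' = v \cdot (a_{11}b_1 + a_{12}b_2 + a_{13}b_3) = a_{11}\, v \cdot b_1 + a_{12}\, v \cdot b_2 + a_{13}\, v \cdot b_3$ etc., so that the change can be described by matrix multiplication with $[v \cdot b_i'] = [a_{ij}][v \cdot b_j]$. The expression of $v$ in terms of the primed basis, on the other hand, is found by solving $b_i' = \sum_j a_{ij} b_j$ to get $b_j = \sum_i (a^{-1})_{ji}\, b_i'$, so $v = v_1' b_1' + v_2' b_2' + v_3' b_3'$ with $v_i' = \sum_j (a^{-1})_{ji}\, v_j$, that is, $[v_i'] = [(a^{-1})_{ji}][v_j]$, which uses the transpose of the inverse matrix.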
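These two rules can be checked numerically. The sketch below does so in two dimensions rather than three, purely to keep the matrix inverse short; the particular basis, the matrix a and the helper names are all made up for the illustration. It checks that the dot products with the new basis are obtained from the old ones by multiplying by a, and that multiplying the old expansion coefficients by the transpose of the inverse of a gives coefficients that rebuild the same vector from the new basis.

```python
from fractions import Fraction

def dot(u, w):
    return sum(x * y for x, y in zip(u, w))

def combine(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i] as a tuple of components."""
    return tuple(sum(c * vec[k] for c, vec in zip(coeffs, vectors))
                 for k in range(len(vectors[0])))

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists (assumes nonzero determinant)."""
    (p, q), (r, s) = m
    d = Fraction(p * s - q * r)
    return [[s / d, -q / d], [-r / d, p / d]]

basis = [(1, 0), (1, 1)]      # old basis b1, b2 (deliberately not orthogonal)
a = [[2, 1], [0, 1]]          # new basis: b_i' = sum_j a[i][j] * b_j
primed = [combine(row, basis) for row in a]

coeffs = [3, -1]              # old expansion (contravariant) coordinates v_j
v = combine(coeffs, basis)    # the vector itself

# Dot-product (covariant) coordinates: new = a times old.
cov_old = [dot(v, b) for b in basis]
cov_new_direct = [dot(v, b) for b in primed]
cov_new_by_a = [dot(row, cov_old) for row in a]

# Expansion (contravariant) coordinates: new = transpose(inverse(a)) times old.
a_inv_T = [list(col) for col in zip(*inverse_2x2(a))]
coeffs_new = [dot(row, coeffs) for row in a_inv_T]
v_rebuilt = combine(coeffs_new, primed)   # should reproduce v
```

The matrix a is chosen to be non-symmetric so that forgetting the transpose would give a visibly different (and wrong) reconstruction.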
Another example of covariant coordinates is the use of partial derivatives to describe the gradient of a scalar field in terms of the basis vectors for the space of displacements. In this case the use of larger (or smaller) units of measurement increases (or decreases) the per-unit change in the field values. More generally, changing to a new basis whose components relative to the old are given by the columns of a matrix M results in the new components of the gradient being given from the old, written as a row, by matrix multiplication on the right by M (equivalently, by the transpose of M acting on them as a column).
Optional Aside: (Explicit calculation of gradient wrt transformed basis)
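One way the calculation might go is sketched here, assuming the new basis vectors are $b_j' = \sum_i M_{ij} b_i$ (so that the columns of M hold their components relative to the old basis). The symbols $f$ for the scalar field and $x_i$, $x_j'$ for the old and new coordinates of a displacement are notation introduced only for this sketch.

```latex
\begin{align*}
x = \sum_i x_i\, b_i = \sum_j x_j'\, b_j' = \sum_j x_j' \sum_i M_{ij}\, b_i
  &\;\Longrightarrow\; x_i = \sum_j M_{ij}\, x_j', \\
\frac{\partial f}{\partial x_j'}
  = \sum_i \frac{\partial f}{\partial x_i}\,\frac{\partial x_i}{\partial x_j'}
  &= \sum_i M_{ij}\, \frac{\partial f}{\partial x_i}.
\end{align*}
```

So the row of partial derivatives is multiplied on the right by M, while the coordinates of a displacement transform the opposite way, with $x' = M^{-1} x$.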
Of course, the gradient could also be assigned coordinates relative to a basis defined in terms of the scalar field itself (e.g. with lengths being defined in terms of changes in the field), and in that case the corresponding coordinates would be contravariant (since using larger field changes as the unit would require a smaller number of units for a given actual change).
Optional Aside: We could instead represent a gradient by the coefficients needed to express it in terms of three basic gradients, defined as the gradients of three particular scalar fields (or perhaps by limits of the change in the field corresponding to motion towards three particular target points as the scaling of the field, rather than the distances to the target points, goes to zero). (More details on contravariant coords for gradient)
Because of the primary role we assign to position in space, it is the lengths and orientations of measuring rods in space that we take as fundamental and so it is relative to those that we consider how quantities vary.
Quantities like the components of a gradient, defined as dot products with displacement basis vectors, are covariant (because increasing the length of a basis vector increases the amount by which the field changes from one end of it to the other), and quantities like the coordinates of a vector, which are measured by what multiple of each basis vector they include, are contravariant (because increasing the length of a basis vector means smaller multiples of it are required).