
Sensor color behavior and the DxO report

Doug Kerr

Well-known member
I will discuss in this series of notes some conundrums in the "color response" page of the DxOMark reports on camera sensor characteristics. In this first note, I will discuss some background matters.

The sRGB primaries

All "RGB family" color spaces are tristimulus color spaces (although many purists reserve that term for a particular color space of that class, the CIE XYZ color space). That means that the color of an instance of light is described by stating the amounts of three defined kinds of light, called the "primaries" of the color space, which,. if combined, would create light of the color of interest.

In the case of color spaces of the RGB family, including the specifically defined color space identified as "sRGB", those three primaries are called R, G, and B.

These three primaries are defined in terms of their chromaticities and base luminances, given in terms of the CIE x, y, Y color space.

Now we know that, for any given color of light, there are an infinite number of kinds of light, with different spectrums, that have that color. (Some would suggest that I say, "appear to the eye to have that color", but in fact the definition of color is such that if two instances of light "appear to the eye to have the same color", they are the same color.) This is the phenomenon of metamerism.

Now we might be tempted to ask, "Does not the complete definition of the three RGB primaries include a standard spectrum for each?" No, it does not. There is no need to do so.

The reason is that if we take two instances of light, both of known color (that is, known chromaticity and known luminance) and combine them, then the resulting light will have a predictable color. This is independent of how those two instances got their colors (what spectrums they had). Thus we need not think, for the definitions of the sRGB primaries, in terms of anything beyond their colors (most important to us is the matter of their chromaticities).
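
As an aside, for those who like to see the arithmetic: that predictability of additive mixtures is easy to express in the CIE XYZ space, where the tristimulus values of a mixture are simply the sums of those of its components. Here is a minimal sketch (the numbers are hypothetical, chosen only for illustration):

```python
import numpy as np

# Two instances of light, each known only by its CIE XYZ tristimulus
# values (hypothetical numbers); their spectrums are irrelevant here.
light_1 = np.array([20.0, 25.0, 30.0])  # X, Y, Z
light_2 = np.array([35.0, 30.0, 10.0])

# Additive combination: the tristimulus values simply add.
mix = light_1 + light_2

# Chromaticity (x, y) of the mixture follows directly.
x = mix[0] / mix.sum()
y = mix[1] / mix.sum()
print(f"mixture XYZ = {mix}, chromaticity = ({x:.4f}, {y:.4f})")
```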

Suppose that for some kind of test we wanted to make of a camera sensor, we wanted a machine that would emit, on command, "the sRGB primaries". Just what would that mean? Would we care what spectrums these three outputs had? The sensor response might depend on that (see below).

We can see that the notion of such a machine is very problematical.

Sensor behavior

It would be very desirable if, in our sensors:

A. the three outputs (which I call d, e, and f to avoid misunderstandings) actually described the color of the light hitting the sensor.​

But, certainly in all the cases I have heard of, that is not so.

We realize that (as mentioned above) there are an infinity of light spectrums that all have the same color (remembering that "color" is in fact defined in terms of the human response to the light of interest). This is the phenomenon of metamerism.

If in fact attribute A were true for our sensor (such a sensor is called colorimetric, meaning "measures color"), then for light of any spectrum having a certain color, the sensor d, e, and f outputs would be the same (and would tell us, consistently, the color of that light).

But, for the typical sensor, if we apply two instances of light having different spectrums but the same color (these would be called metamers), the sensor would in general deliver different sets of d, e, and f values. This is a shortcoming we can't overcome, but can "dodge" (more about that later).

Now if we had a sensor that did exhibit attribute A, it might be very handy if, in addition:

B. the sensor outputs, d, e, and f, would (perhaps after applying some fixed scaling to compensate for their differing "sensitivities") correspond to the linear coordinates of the color space in which we intended to work. Perhaps they would correspond to the linear coordinates, r, g, and b, of the sRGB color space.

Then, after demosaicing (so we had a d, e, and f value for each pixel location), we could just take the d, e, and f values for any pixel and (perhaps after scaling) proceed directly to convert them (by applying "gamma precompensation") to the R, G, and B values (sRGB basis) of the pixel in the developed image.
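
For concreteness, here is a small sketch of that last "gamma precompensation" step, using the standard sRGB transfer function; the input values are hypothetical stand-ins for scaled sensor outputs:

```python
import numpy as np

def srgb_gamma(linear):
    """sRGB gamma precompensation: linear r, g, b (0..1) to nonlinear R, G, B."""
    linear = np.asarray(linear, dtype=float)
    return np.where(linear <= 0.0031308,
                    12.92 * linear,
                    1.055 * np.power(linear, 1.0 / 2.4) - 0.055)

# Hypothetical scaled sensor outputs taken directly as r, g, b:
print(srgb_gamma([0.18, 0.50, 0.02]))
```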

Of course, since, for our sensors, attribute A does not obtain, then attribute B is moot. But the notion will play a role in what is to follow.

To slightly circle back to a point made in the previous section, suppose that for some reason, as part of a series of tests to investigate the behavior of a certain sensor, we decided to measure its outputs (d, e, and f values) when it was stimulated (separately) by "the three sRGB primaries". This at first sounds like it could be useful, and the results very informative.

But now we have to think more about just what that would mean. What kinds of light might we use as "the three sRGB primaries"? What spectrums would they have (remembering that the primaries are only defined in terms of their colors, and there are an infinity of spectrums that would have each of those colors)?

Maybe that wouldn't matter. Well, that would be true if the sensor were colorimetric (that is, if its d, e, and f outputs depended only on the color of the light, regardless of the spectrum that gave it that color).

But we have just seen that for actual sensors the d, e, and f outputs do not depend only on the color of the light; they depend also on the spectrum that gave the light that color.

So if we are going to test the response of the sensor when it is excited by "one of the sRGB primaries", what spectrums would we prescribe for the three "test lights"? And why those particular ones?

So maybe "testing the response of a sensor to excitation by 'the sRGB primaries' " is not actually a good idea.

Hold that thought.

[to be continued]

Best regards,

Doug
 

Asher Kelman

OPF Owner/Editor-in-Chief
For the rest of us!

Before you ask, "Do I need this?"

Well, when you try to get wonderful color, either matching close to what one perceives at the time or else conjured up to present some preference of mood or meaning, one can benefit from knowing some basics about "the stream of data" made after the arrival of light at your camera's sensor creates an electric charge in each individual photocell, approximately proportional to the intensity of light shining into that photocell of silicon. We don't need to know the physics, but we can benefit from knowing the logical steps taken to translate a charge to a voltage signal that can be assembled to create color, which can be described by 3 component parameters, mapped on a grid, and then remapped somehow to a color space all our other equipment is built to receive and do something with, like create an image on a screen or a picture on a print.

This is critical since some cameras might or might not alter the very first "color map" of your image in some way (for example, to account for a yellow color cast of the glass in one of their lenses), or else package references to possible corrections that your Capture One, Photoshop, GIMP, DxO or other software might use.

On One, for example, doesn't seem to have any capacity for taking non-neutral lenses into account, even for in-camera JPG creation, whereas Adobe Camera Raw (in Lightroom and Photoshop) has lookup tables for many individual lenses, separately for color corrections and for geometric and chromatic aberration corrections.

So, my friends, reading Doug's essay might better prepare us for at least recognizing the value and shortcomings of various choices for workflow starting with so-called "RAW" files!

Asher
 

Doug Kerr

Well-known member
[Continued]

Part 2

The DxOMark reports

DxOMark is an operation of the well-respected testing laboratory DxO Labs. They publish technically-precise reports on the image behavior of numerous digital cameras. (These reports do not treat, for example, AF speed, burst rate, shape of the grip, and so forth.)

Under the tab "Measurements", the tab "Color Response" gives a fascinating report on the colorimetric behavior of the camera's sensor. But there are many conundrums in this report. This is the main topic of this series of notes.

Here is a typical such report, for the Canon EOS 40D. By the way, for each camera, there are two such pages, one predicated on the measurements being made under CIE illuminant D50 and the other predicated on the measurements being made under CIE illuminant A. We see the "D50" version.

DxOMark_Canon_40D_D50.jpg

From DxOMark, used here under the doctrine of fair use​

There are many "interesting" things on this page, and they link together in sometimes mysterious ways. I will move through them on a path I hope is useful.

The centerpiece of the discussion here is the leftmost set of three bar charts. (The fourth one covers a different, although - maybe - related matter, but ignore it for now.)

We will proceed by looking at the charts as if we did not have the benefit of the (not easily found) DxOMark explanation of their significance. That is, we would hope that they would be labeled so as to generally explain what is going on. But don't hold your breath.

Each of the charts is labeled, for example, "Sensor Red Channel (R raw)". We might reasonably assume that this refers to the sensor output I call "d" (to avoid, for example, the misunderstanding that this is the R coordinate, in the sRGB color space, of the color of the light falling on the sensor).

The three bars on each chart are labeled, collectively "sRGB primaries", the first one in each set being labeled "R sRGB".

The intimation is that these charts show the results of testing the sensor by exciting it (separately) with the "three sRGB primaries", and seeing for each what the relative values of the three sensor outputs are. Thus we see that excitation of the sensor by the sRGB G primary of some normalized potency would give an output of 0.39 units from the "R" channel, 0.84 units from the "G" channel, and 0.50 units from the "B" channel.

Now our first thought is that, well, that speaks well of the "G" channel, but not so well of the "R" and "B" channels.

Except for a fallacy I will get to in a jiffy, that would be a reasonable interpretation if we actually looked to our sensor to directly deliver an output in terms of r, g, and b, the linear values that are turned into the nonlinear R, G, and B values of the sRGB color space. But of course that would not be a precondition of the sensor being "colorimetric"; it would just be the most handy way for it to be colorimetric.

But in fact we learned in Part 1 of this series that there is no such thing, in terms of physical light, as a unique "sRGB primary" - there are an infinite number of kinds of light that would all exhibit the color that is the sole defined characteristic of, for example, the sRGB R primary. So we could not make such a test.

And in fact, even if we could make such a test, the results would not tell us whether the sensor is perfectly colorimetric, and if not, how far from that ideal it was, and in what way. (There is a way to probe that aspect of sensor behavior, which we will get to later.)

So this story is not yet reaching from cover to cover.

Now, indeed, the illuminant "under which the measurements are made", in this case CIE illuminant D50, does have a completely defined spectrum. And so if we would say that we had exposed our sensor to illuminant D50 and noted the three outputs (d, e, and f in my notation; R raw, G raw, and B raw in DxO's notation), that would be meaningful. And in fact that result is reported on the fourth bar chart. (No, its labeling is not at all consistent with that. But there is a lot of that going around on this page.)

And, I point out that if we were to look at the "CIE illuminant A" version of this page, all three bar charts are different. But if the first three bar charts actually did report the result of the illumination of the sensor by "the three sRGB primaries", the results would not be different because of the choice of a certain illuminant for other measurements.

So we are faced with a conundrum as to just what test results are reported by the first three bar charts.

Now, the discussion of these bar charts in the DxO background material suggests that, since the three sensor outputs do not reflect only response to the corresponding sRGB primaries (that is, there is more than one non-zero bar on each chart), we must "back out" (my term) this lack of "purity" by taking the three outputs and multiplying them by a 3 × 3 matrix.

As they describe it, we set up the nine numbers on the three bar charts as a matrix, take the inverse of that matrix, and use that (in actual camera operation) to multiply the three sensor outputs to get the real sRGB coordinates of the color. This derived matrix is apparently the one we see on this report page labeled as "Color matrix as defined in ISO standard 17321".

As evidence of that, if I take the nine values presented on the three bar charts in this page, set that up as a matrix, and take its inverse, then, by golly, that is exactly the matrix shown under "Color matrix as defined in ISO standard 17321".
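
Anyone who wants to repeat that check can do so in a few lines. A sketch, assuming we have read the nine bar values off the page (the numbers below are hypothetical placeholders, not the actual 40D values):

```python
import numpy as np

# The nine bar values, one chart per row, in reading order
# (hypothetical placeholders; substitute the values from the report page).
bars = np.array([
    [1.00, 0.39, 0.11],
    [0.46, 0.84, 0.32],
    [0.07, 0.50, 0.77],
])

derived = np.linalg.inv(bars)
print(derived)  # compare with "Color matrix as defined in ISO standard 17321"

# Inversion is reversible: inverting the result recovers the bar values.
assert np.allclose(np.linalg.inv(derived), bars)
```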

And the labeling of that matrix suggests that it is in fact intended to take a set of the three sensor outputs (R raw, etc.) and transform it into a set of three color space coordinates (R sRGB, etc.).

Of course, the R, G, and B values in an sRGB color are nonlinear versions of the underlying basic coordinates, often labeled r, g, and b, and the matrix could not possibly yield the nonlinear values R, G, and B. But by now we realize that, at best, this report plays fast and loose with notation.

In part 3 of this series, we will look into just what is ISO 17321 anyway. That's a little complicated, so let's break for a refreshing glass of something (iced tea here).

[To be continued]

Best regards,

Doug
 

Doug Kerr

Well-known member
[Continued]

Part 3

Some things about ISO 17321

I have emphasized before in this series and in other related essays that almost invariably our camera sensors are non-colorimetric. That means that their outputs (which I call d, e, and f for most of the sensors we are concerned with) do not reliably describe the color of the light falling on the sensor.

This shortcoming is often characterized as metameric error. To review, although the color of an instance of light is dictated by its spectrum, there are an infinity of different spectrums that will have the same color. This phenomenon is called metamerism. The different kinds of light (with different spectrums) that have the same color are called metamers.

With a non-colorimetric sensor, different metamers with the same color will in general have different sensor outputs, thus the term "metameric error".

For a sensor to be colorimetric, it must obey a set of criteria called the "Luther-Ives Conditions". It is impractical (at least being mindful of other objectives) to fulfill these. Thus we are stuck, to some degree, with this imperfection in the sensor "reporting" the color of the light on each of its pixels.

ISO Standard 17321-1, "Graphic technology and photography — Colour characterization of digital still cameras (DSCs) — Part 1: Stimuli, metrology and test procedures", has as a major thrust the characterization of the metameric error behavior of a camera sensor. (We might more accurately say "metameric error potential" of a sensor, since the actual degree of metameric error is a function of exactly how we process the camera's raw data.)

The test procedure begins by taking a shot, under a certain illuminant, of a standardized test chart with multiple colored patches (specifically, an X-Rite ColorChecker chart). Only the "colored" - that is, "non-neutral" - patches participate.

Since we know the spectrum of the chosen illuminant, and know the reflective spectrum of each of the color patches, we can determine the spectrum of the light expected to be reflected from each, and from that, the color of that light.
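
In computational terms, that determination looks roughly like the following sketch. The arrays here are crude stand-ins (a flat illuminant, a made-up reflectance, Gaussian bumps in place of the real CIE color matching functions); the real calculation uses tabulated D50 and CIE 1931 data:

```python
import numpy as np

wavelengths = np.arange(400, 701, 10)                    # nm, 10 nm steps
illuminant = np.ones_like(wavelengths, dtype=float)      # stand-in for D50's SPD
reflectance = np.linspace(0.2, 0.8, wavelengths.size)    # hypothetical patch

# Stand-ins for the CIE color matching functions x-bar, y-bar, z-bar:
cmf = np.stack([
    np.exp(-0.5 * ((wavelengths - 600) / 40.0) ** 2),
    np.exp(-0.5 * ((wavelengths - 550) / 40.0) ** 2),
    np.exp(-0.5 * ((wavelengths - 450) / 40.0) ** 2),
])

# Light reflected from the patch, wavelength by wavelength:
reflected = illuminant * reflectance

# Its color: integrate the spectrum against the matching functions.
XYZ = cmf @ reflected * 10.0                             # 10 = sampling interval
print(XYZ)
```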

Of course, we know that the three sensor outputs do not tell us the color of the light striking the sensor, and if we have in mind a representation in terms of some recognized color space, they don't even come close. So in actual use, we transform the sets of three sensor outputs (perhaps by simple multiplication by a 3 × 3 matrix) into the underlying linear coordinates of that color space.

The next step of the ISO 17321 measurement procedure mimics that practice. In it, using a mathematical algorithm, we devise an "optimum" matrix for that operation. It is optimum in that, using it consistently to transform the set of three sensor outputs from each color patch into, for example, the L*a*b* color space (by way of the CIE XYZ color space), and determining for each patch the discrepancy between that resulting color and what we expect the color of the light from that patch to be, the average of that discrepancy (over all the patches) is the least that can be achieved with any matrix.
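
To make the idea concrete, here is a simplified sketch of devising such a matrix. The real ISO 17321 optimization minimizes the average error in L*a*b*, which requires nonlinear methods; the sketch below cheats by doing ordinary linear least squares in XYZ, and all the data are random stand-ins for the 18 patch measurements:

```python
import numpy as np

rng = np.random.default_rng(0)
raw = rng.uniform(0.05, 1.0, (3, 18))     # d, e, f outputs, one patch per column
target = rng.uniform(0.05, 1.0, (3, 18))  # known XYZ of the light from each patch

# Find M minimizing ||M @ raw - target|| over all patches:
# solve raw.T @ M.T = target.T in the least-squares sense.
M_T, residuals, rank, sv = np.linalg.lstsq(raw.T, target.T, rcond=None)
M = M_T.T
print(M)                                   # the "optimum" 3 x 3 matrix
print(np.abs(M @ raw - target).mean())     # the irreducible average discrepancy
```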

Then the actual magnitude of this "irreducible" average color error is used as the basis for the metric of metameric error, called the Sensitivity Metamerism Index (SMI). Its value reaches 100 ("perfect") if that irreducible average color error is in fact zero.
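
The commonly quoted form of that metric (this is my reading of the published descriptions, so take the constant with a grain of salt) is SMI = 100 - 5.5 × (average ΔE*ab over the patches). A trivial sketch:

```python
import numpy as np

def smi(delta_e):
    """Sensitivity Metamerism Index from the per-patch L*a*b* color errors.
    The 5.5 scaling is the commonly published constant; 100 means the
    irreducible average color error is zero."""
    return 100.0 - 5.5 * float(np.mean(delta_e))

print(smi([0.0] * 18))       # a perfectly colorimetric sensor: 100.0
print(smi([2.0, 3.0, 4.0]))  # average error 3.0 -> SMI 83.5
```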

Does ISO 17321 suggest that, for a given sensor, that "optimum" matrix be used in the camera for transformation of the sensor outputs to the XYZ color space (and then perhaps on to the sRGB color space)?

No. The standard makes no recommendation as to camera design practice.

Back to the DxOMark report

Does DxOMark go through this process? Evidently, since a value of the SMI is reported.

Is the matrix presented with the label "Color matrix as defined in ISO standard 17321" the one that is derived as part of that process? Well, one would think.

But that matrix (defined in ISO 17321 as part of the SMI measurement process) is not defined as the inverse of a matrix that describes the response of the three sensor channels to "the three sRGB primaries" (and remember, we saw that there really can't be such a thing anyway). Rather, it comes from a much more complex process, involving the photography of 18 color patches.

Yet, if we look at one of the reports, we find that the "Color matrix as defined in ISO standard 17321" is always precisely the inverse of the matrix that corresponds to the nine values in the three bar charts. How can this be?

Well, I think that the matrix has in fact been constructed as part of the ISO 17321 SMI determination process. Then, the inverse of that (remember, the inversion of a matrix is a reversible process; if A is the inverse of B, then B is the inverse of A) is used to determine the values on the nine bars on the bar charts.

Now, having done that, what is the significance of those bar charts? Well, beats the hell out of me so far. But it certainly is not in any way revealed by the labeling (at least to an old telephone engineer).

I have a feeling that we may be dealing here with remnants of another kind of testing that may have been used by DxOMark at an earlier time.

I have sent a blind inquiry to DxOMark about this conundrum. No response so far.

-#-​

Best regards,

Doug
 

Doug Kerr

Well-known member
Suppose that we somehow knew that the sensor we planned to use in a new camera design was essentially colorimetric. Perhaps someone reliable had tested it and reported its SMI to be 100% ("zero average metameric error"), but we had no further information.

But that characterization means that its outputs were in fact the coordinates of a tristimulus color space (albeit not likely one we had ever heard of).

That would be great news. It means that we can transform the sensor outputs (d, e, f) to, for example, r, g, b (the linear coordinates of the sRGB color space) by simple multiplication by a 3 × 3 matrix, and have zero metameric error - the resulting r, g, b values (gamma-precompensated to R, G, B for actual use) would always reliably indicate the color of the light at each pixel of the image.

But how would we determine what that matrix should be? Well, we could shoot a test target of only three well-chosen color patches, whose reflective spectrums were known, under any handy standard illuminant, and note the sensor outputs for each. Then we would use the procedure and algorithm in ISO 17321 to construct our matrix (which would not just be "optimum" but in fact "perfect").
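
With only three patches, the "optimum" matrix collapses to the exact solution of a small linear system, so the full ISO 17321 machinery is hardly needed. A sketch, with all the numbers hypothetical:

```python
import numpy as np

# Sensor outputs (d, e, f) for the three patches, one patch per column,
# and the known linear sRGB coordinates (r, g, b) of the light from each
# (hypothetical values throughout).
D = np.array([[0.61, 0.12, 0.05],
              [0.30, 0.70, 0.21],
              [0.04, 0.18, 0.52]])
RGB = np.array([[0.82, 0.05, 0.02],
                [0.11, 0.88, 0.09],
                [0.02, 0.07, 0.79]])

# For a truly colorimetric sensor, T @ D = RGB has an exact solution:
T = RGB @ np.linalg.inv(D)

# Thereafter T maps any (d, e, f) to (r, g, b) with zero error.
print(np.abs(T @ D - RGB).max())  # ~ 0
```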

And of course if we calculated the SMI for this sensor (on such a limited test regimen), it would come out to be 100%.

Best regards,

Doug
 

Doug Kerr

Well-known member
Another interesting facet of the DxO sensor Color Response page is the matter of the relative sensitivity of the three sensor channels. We often hear (accurately) that in the typical sensor, the "R" and "B" channels are less sensitive than the "G" channel. Among other things, this has an implication in the noise in the various "channels" of the developed image.

But if we wanted to quantify the sensitivity of the three channels of a certain sensor, just what would we measure?

One thought that might come to mind is to measure the channel output when the sensor is excited by the "corresponding" primary of the color space of interest. First, what does "corresponding" mean? Well, I suppose, corresponding in name. After all, the three sensor channels are called (not by moi) "R", "G", and "B", and the three primaries of any given RGB-family color space are called "R", "G", and "B", so . . .

But we earlier realized that "the sRGB 'R' primary" (for example) does not have a unique physical representation. That is, there are an infinite number of light spectrums that all exhibit the chromaticity specified for that primary. We know that the sensor "R" channel would have different responses to all those many metamers of the "R" primary.

So that won't work.

What DxOMark does is measure the response of each sensor channel when the sensor is excited by the illuminant chosen for the report.

For convenience, I will show again the illustrative DxO report page we saw at the beginning of this series:

DxOMark_Canon_40D_D50.jpg

From DxOMark, used here under the doctrine of fair use​

The results of those measurements (in this case, for excitation of the sensor by illuminant D50) are presented in the rightmost (fourth) bar chart. The labeling of the chart of course gives no clue to what this is (it is in fact just plain "wrong").

The three bars in fact represent the response of the sensor's "R", "G", and "B" channels to excitation of the sensor by the pertinent illuminant (normalized so that the greatest value is 1.00).

Now there is often talk of "equalizing" the outputs of the sensor so as to cancel out this difference in "sensitivity". We would do that by, in actual operation, multiplying the three sensor outputs by the reciprocals of their respective relative sensitivities. And, whaddya know, these three reciprocal values are found on the report, in the panel labeled "White balance scales". ("White balance scaling factors" would be more apt, but they didn't ask me.)
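
The arithmetic is as simple as it sounds; a sketch, with sensitivities chosen (hypothetically) to echo the 40D's D50 numbers:

```python
import numpy as np

# Relative responses of the R, G, B channels to the report's illuminant,
# normalized so the greatest is 1.00 (hypothetical values).
sensitivity = np.array([0.48, 1.00, 0.67])

# The "white balance scales" are just the reciprocals:
wb_scales = 1.0 / sensitivity
print(np.round(wb_scales, 2))   # [2.08 1.   1.49]

# Applied to the outputs for a neutral patch, they equalize the channels:
neutral = sensitivity * 0.8     # outputs proportional to channel sensitivity
print(wb_scales * neutral)      # [0.8 0.8 0.8]
```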

So now what do we have? Well, the labeling of the table of factors gives a strong clue. If we "adjust" the sensor outputs in this way, then for a part of the sensor that receives light from a neutral object, the three outputs of the sensor will be equal.

That is, we have achieved "proper white balance" if we consider the (adjusted) sensor outputs to be the coordinates of a color space. Which we don't. Because they aren't.

We often hear it said that "We must, at least conceptually, 'adjust' the sensor outputs to compensate for the differences in channel sensitivity before we do any further processing of the sensor data". Hmm.

Well, let's reflect again on the workings of ISO 17321. Is the matrix that is developed (solely for purposes of determining the SMI, of course) used to multiply the "adjusted" sensor outputs to get what we will consider to be the coordinates of the color of the light on the sensor? No. There is in fact no mention in ISO 17321 of the "relative sensitivities of the sensor channels", or of the concept of adjusting the sensor outputs for same. The infamous matrix is developed by calculations working on the sensor outputs for the various color patches - not the "adjusted" sensor outputs.

And the matrix is used to multiply the sensor outputs - not the "adjusted" sensor outputs - to get what we will consider to be the coordinates of the color of the light on the sensor.

But in fact we do sometimes make use of the "white balance scaling factors". For example, in Canon EOS cameras, we find in the proprietary part of the Exif metadata a table with entries for all the different "white balance presets", which are essentially choices for the assumed chromaticity of the scene illumination. Each entry has four numbers, which are in effect the "white balance scaling factors" for the sensor channels (the sensor of course really has four channels, R, G1, G2, and B, because each Bayer cluster has one R, one B, and two G photodetectors). Of course the two G values are always (almost) identical.

What are these values to be used for? Well, in an external raw development program, we might want to use a naïve process for chromatic adaptation correction. We can just multiply the sensor outputs by the respective "white balance scaling factors" from the table for the illumination we assume was in effect for the shot, and then make a transformation from the sensor outputs to r, g, and b (using a matrix that assumes that for the color of the white point of our color space, the sensor outputs will be equal).
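
Here is a sketch of that naïve development path. The matrix is hypothetical; its rows are made to sum to 1.0 so that equal scaled outputs land on the color space's white point, per the assumption just described:

```python
import numpy as np

wb_scales = np.array([2.03, 1.00, 1.51])   # from the Exif table for the preset

# Hypothetical transform matrix; each row sums to 1.0, so equal (scaled)
# sensor outputs map to equal r, g, b - that is, to the white point.
T = np.array([[ 1.52, -0.42, -0.10],
              [-0.18,  1.33, -0.15],
              [ 0.02, -0.44,  1.42]])

def develop(sensor_def):
    scaled = wb_scales * sensor_def        # naive chromatic adaptation step
    return T @ scaled                      # then to linear sRGB r, g, b

print(develop(np.array([0.25, 0.50, 0.33])))
```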

Going back to the transformation matrix "per ISO 17321", if we feel the need to think in terms of "correction for the differing sensitivities of the sensor channels" (again, a notion never mentioned in ISO 17321), we can think of that as being built into the matrix.

Best regards,

Doug
 

Doug Kerr

Well-known member
It is interesting to compare the DxOMark report for the Canon EOS 40D with regard to the "relative sensor channel sensitivities" with the tables in the Exif metadata in files from that camera.

Of course, the camera does not have white balance presets for "CIE illuminant D50" or "CIE illuminant A". But I took the preset that can be customized in terms of "color temperature" and set that to 5000 K (essentially the correlated color temperature for illuminant D50). Of course, there might be a small discrepancy with regard to the Planckian offset (the other "parameter" of an illuminant's chromaticity), but I just didn't worry about that.

Having done that, I fired a shot and examined the tables in the Exif metadata.

For that preset, the four constants were 2076, 1024, 1024, 1455 (for R, G1, G2, and B respectively). As is usual (but not universal), these were normalized so that both G values were 1024 (this is evidently an 11-bit value, with possible range 0-2047, so that is essentially "mid-range").

If we normalize these to the G value as 1.00, then the three values (I will give only one for G) are 2.03, 1.00 (of course), and 1.51.

In the DxOMark report page for the 40D, based on illuminant D50, the three values of the "White balance scales" are 2.07, 1.00, and 1.49. These values are of course very similar to those suggested by the camera Exif metadata.

Best regards,

Doug
 

Doug Kerr

Well-known member
As you know, I have been vexed by the significance of the three bar charts on the Color Response page of the DxOMark report on a camera sensor.

I had sent a blind inquiry to DxOMark asking for help in understanding this report. This morning I received a very kind reply from Sophie Cornillet-Jeannin of the DxOMark support team.

It was helpful, but did not completely resolve my concerns. Perhaps it even raised some new ones!

In the next stanza of this saga, I will discuss that response. But to prepare you for that, I will give a brief (!) tutorial on some matters of matrix algebra.

BACKGROUND

Notation

Rather than use abstract examples, I will use examples from this particular topic. However, I will use my own notation, and thus I will begin by describing it.

I use the symbols d, e, and f to represent the three "channel" outputs of our sensors. These are often called R, G, and B, but that can be misleading.

(These are called in the DxOMark report R raw, G raw, and B raw, which are certainly better than R, G, and B.)

I use the symbols r, g, and b to represent the linear coordinates of a color in the sRGB color space. Once those values have been placed on a nonlinear basis, they are designated R, G, and B (the coordinates actually stated for a color in the sRGB color space).

(These are called in the DxOMark report R sRGB, G sRGB, and B sRGB, which of course is nice in that it makes the color space involved clear, but nevertheless those suggest the nonlinear coordinates, which those values are not.)

Our non-colorimetric sensors

The three output values, d, e, and f, do not correspond to r, g, and b. In fact, they are not the coordinates of a color in any color space. The reason is that our sensors are non-colorimetric; that is, a set of three output values does not correspond consistently to a color, and all metamers of a color do not necessarily produce a consistent set of sensor output values.

In an actual camera, if we consider the "simplest" scheme, we transform the set of three sensor outputs, d, e, f, to a set of linear sRGB coordinates by multiplying d,e,f by a 3 × 3 matrix. Because of the considerations I just discussed, this is an "imperfect" operation. But we choose the matrix to produce a result that is as free of error as is possible. That means in particular that the average color error, averaged over tests with a prescribed set of color patches of accurately-defined reflective color, under some particular illuminant, will be as small as possible.

ISO standard 17321

ISO standard 17321 provides for the determination of what I will call the "minimum attainable average error" in color representation for a given sensor (considering operation under a certain illuminant). As part of that process, the standard tells how to construct the "optimum matrix" for transforming d, e, f to r, g, b. Actually, it works from d, e, f to X, Y, Z, the coordinates of the CIE XYZ color space, and then to L*a*b*, the coordinates of the CIE L*a*b* color space. But since these color spaces are related by firm mathematical relationships, we could take a matrix developed under the ISO 17321 procedure and transform it into a matrix that would work (such as it is) from d, e, f to r, g, b.

MATRIX OPERATIONS

Our transformation matrix

The transformation of a set of the three outputs of a sensor, d, e, and f, to the best set of the linear sRGB coordinates, r, g, and b is done by multiplying the "vector" (a matrix with only one row or one column) containing d, e, f by the transformation matrix, which I will call T. The formal way to show that operation is this:

Matrix-50a.gif

Note that I use the "multiply" sign to represent the multiplication that is involved; in actual formal work this is not used, the adjacency of the two matrixes implying multiplication.

I have shown that the transform matrix, T, is composed of nine coefficients, designated T11 through T33.

But in engineering work, a shorthand is often used to represent this setup:

Matrix-50b.gif

This is actually very helpful to assist in understanding what the matrix does, and in fact how to actually calculate the result in a particular case. It treats the matrix like a network with three inputs and three outputs, a notion much beloved to engineers (electrical ones in particular).

We can read this presentation this way: input d is multiplied by T11 to become a component of output r; input e is multiplied by T23 to become a component of output b.

If we gather up all the components of output g, we find that it is calculated this way:

g = d•T12 + e•T22 + f•T32
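
A small sketch may help in seeing that this engineering layout is simply the transpose of the formal one (the coefficients here are hypothetical):

```python
import numpy as np

# Engineering layout: rows are the inputs (d, e, f), columns are the
# outputs (r, g, b); the cell in row i, column j connects input i to
# output j. That is the transpose of the formal matrix for which
# [r, g, b] = T_formal @ [d, e, f].
T_eng = np.array([[0.9, 0.1, 0.0],   # hypothetical coefficients
                  [0.2, 0.8, 0.1],
                  [0.0, 0.1, 0.9]])

d, e, f = 0.5, 0.3, 0.2

# Gather up the components of output g (the middle column):
g = d * T_eng[0, 1] + e * T_eng[1, 1] + f * T_eng[2, 1]

# The same result via the formal row-times-column convention:
assert np.isclose(g, (T_eng.T @ np.array([d, e, f]))[1])
print(g)  # 0.31
```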

The inverse of our matrix

If matrix T can be used to transform d, e, f to r, g, b, then the inverse of that same matrix will transform r, g, b to d, e, f. (Don't worry now about why we might want to do that.) To determine from a matrix its inverse is not, as we might hope, a trivial operation, but is mathematically rather complicated. Fortunately, my Excel spreadsheet will do the work for me in any specific case.
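
(For those without the spreadsheet handy, the same inversion, and the round trip just described, takes only a couple of lines in Python; the matrix here is a hypothetical one.)

```python
import numpy as np

T = np.array([[ 1.9, -0.6, -0.2],    # hypothetical transform matrix
              [-0.3,  1.6, -0.3],
              [ 0.1, -0.5,  1.5]])

T_prime = np.linalg.inv(T)           # what Excel's MINVERSE delivers

# Round trip: T takes d,e,f to r,g,b; T' takes that r,g,b back to d,e,f.
def_vals = np.array([0.4, 0.7, 0.3])
rgb = T @ def_vals
assert np.allclose(T_prime @ rgb, def_vals)
```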

If we use the formal presentation for the use of the inverse of our matrix T (and I will call its inverse T ') to transform a set of r, g, b to a set of d, e, and f, that would look like this:

Matrix-50c.gif

Since I call this matrix T ', I have labeled all its nine coefficients with T ' designations.

Using the "engineering" notation, that would look like this:

Matrix-50d.gif

Among other things, this helps us figure out the detailed mathematics needed to calculate the result in a specific case. It works just like in the example above.

Now I have to poop the party a little bit. We have heard that the transformation from d, e, f to r, g, b is not "rigorous", and our matrix does it in a way that results in the least average error over a collection of colors. That having been said, if we take some set of r, g, b and transform it, using the inverse matrix, to d, e, and f, just what does that set of values represent?

Well, it would be the one set of d, e, f that our original matrix would transform to that particular set of r, g, b. It may not be clear what the significance of that is. Hold that thought.

Now, I hope you are prepared to read the next stanza of this saga. Now I have to get prepared to write it!

Best regards,

Doug
 

Doug Kerr

Well-known member
Yes, I know this is beginning to sound like, "Eat all your vegetables before you get your dessert." And the vegetables just keep coming.

************

Much of our attention involves the infamous "ISO 17321" matrix that appears on the DxO Color Response report page.

Recall this is a matrix that we might want to use to transform the sensor outputs (d, e, and f in my notation) to the linear coordinates of an sRGB color representation (r, g, and b in my notation).

I say "might want to use" because, given that our sensors are non-colorimetric, there is no unique "proper" transformation matrix.

Rather, the matrix we see here is one that is developed using a procedure in ISO 17321. It is intended to be an "optimal" matrix in a very specific way: used to transform d, e, f to r, g, b for 18 standardized color patches illuminated by a certain standard illuminant, it gives the smallest average color error (averaged over those 18 patches). It is not "recommended" for use in actual cameras. It is just used in a "virtual camera" used in the ISO procedure to determine the sensor's potential for color accuracy. But we might want to actually use it in a camera.

That's a pretty esoteric notion. What I want to do in this stanza of the DxOMark report saga is to give a better intuitive understanding for this matrix.

To do that, I will first assume a colorimetric sensor; that is, one whose outputs are the same for a given color regardless of what spectrum it has. Having done that, it turns out that there is a "perfect" matrix for transforming d, e, f to r, g, b. We need not think in terms of one that minimizes the overall average error for a certain set of "targets".

I will further at first assume an even happier situation: the outputs of the sensor are in fact the r, g, b coordinates for the color involved. Said another way, the d photodetectors respond to the r-ness of the color of the light, and not at all to its g-ness or b-ness.

By, for example, "r-ness" I mean that aspect of the color of the light that is properly represented by the r coordinate of its r, g, b representation.​

Mathematically, that behavior is:

r = d, g = e, b = f

To perform that trivial computation with a matrix, the matrix would look like this (again I use the "engineering" labeling):

Matrix_unity-01.jpg

Again this means that to get r, we take the values of d, e, and f and multiply each by the matrix coefficient in its row and in the "r" column, and sum all those products to get r.

Of course, in this case d gets multiplied by 1 and e and f get multiplied by zero, so the result for r is just d.

Now let's think of another sensor. Its e photodetectors respond only to the g-ness of the color of the light, and its f photodetectors respond only to the b-ness of the color of the light.

But the d photodetectors respond appreciably to the r-ness of the light, and also a little bit to its g-ness and its b-ness. To compensate for that when we transform d, e, f to r, g, b, we might use a matrix like this:

Matrix_r-impure-01.jpg

What happens here is this. To get r, we start out by taking d (and multiplying it by 1). But we know that if there is any g-ness or b-ness in the color, the value of d would have been inflated by the D photodetectors' undesirable response to g-ness and b-ness.

So we discount the d value by a fraction of the e and f values to negate the effect of those components of the light's color on the d value from the D photodetector. That is of course done by the two negative coefficients in the r column.
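
A sketch of that compensation at work (the impurity fractions are hypothetical; note I use the formal row-times-column convention here rather than the engineering layout):

```python
import numpy as np

# Hypothetical "impure" sensor: d picks up a little of the color's g-ness
# and b-ness; e and f are perfectly pure.
A = np.array([[1.0, 0.2, 0.1],    # d = r + 0.2 g + 0.1 b
              [0.0, 1.0, 0.0],    # e = g
              [0.0, 0.0, 1.0]])   # f = b

# The compensating matrix discounts d by those same fractions of e and f:
C = np.array([[1.0, -0.2, -0.1],
              [0.0,  1.0,  0.0],
              [0.0,  0.0,  1.0]])

rgb = np.array([0.6, 0.5, 0.3])   # the actual color (linear coordinates)
def_vals = A @ rgb                # what the sensor reports
print(C @ def_vals)               # exactly recovers [0.6, 0.5, 0.3]
```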

Hold that thought.

Best regards,

Doug
 

Doug Kerr

Well-known member
As I mentioned when starting this "chapter", I had sent a blind inquiry to DxOMark asking for help in understanding this report, especially the three bar charts.

Yesterday I received a very kind reply from Sophie Cornillet-Jeannin of the DxOMark support team. She said as follows (I have separated the aspects of her reply for ease of later reference):

A. In fact we do not expose the sensor with sRGB pure primaries.

B. These values are computed as you noticed with the inverse of the color matrix that is given in the same page.

C. What we do is that we try to show with this graph the amount of each channel we need to use to produce one sRGB primary.

D. The idea is to detect sensors that would have a large overlap on the spectral sensitivity point of view.

Please have a look to this article for more information:

http://www.dxomark.com/Reviews/Canon-500D-T1i-vs.-Nikon-D5000/Color-blindness-sensor-quality

Items A and B of her reply comport perfectly with my earlier observations and thoughts.

And D makes sense, in a qualitative way.

But I have some problems with item C.

Firstly, it is clearly inappropriate to refer to "to produce one sRGB primary". Of course an sRGB primary is a kind of light, not a value, for example, of a color space coordinate.

Almost certainly what she means is:

What we do is that we try to show with this graph the amount of each channel we need to use to produce one unit of an sRGB linear coordinate.​

Now, in all fairness to the author, one of the three sRGB coordinates (r, g, and b in my notation) tells us the amount of an sRGB primary to include in the "mix" that will produce the color being described. Thus it is understandable (but inaccurate) to think of the linear sRGB coordinates as being "sRGB primaries".

But now to what almost certainly is Sophie's point in Item C. If we had in hand the matrix that is considered as being used to transform d, e, f to r, g, b, what tells us "the amount of each sensor output that is to be included in the calculation of each sRGB coordinate"?

Well, for example, in calculating the value of r, one component is obtained by taking d and multiplying it by the coefficient in the upper-left cell of the matrix.

In the language of interest, the "amount of" the d sensor output that is included in the value of coordinate r is given by that matrix coefficient.

But the values of the nine bars on the three bar charts are not the coefficients of "the matrix" but rather are the coefficients of its inverse.

Now, what is the significance of that inverse matrix? Well it allows us to calculate, for any given set of r, g, b values, what set of d, e, f values would give that set of r, g, b values after transformation by the matrix.

But, if we put aside the fact that our sensor is non-colorimetric, we could also think in the following terms:

If we know the actual color of a scene object, in terms of r, g, and b, then we can transform that set of r, g, b values into the d, e, f values (the sensor outputs) that the sensor would produce from that color. Perhaps this is what the set of bar charts is supposed to tell us.

Staying with that simplistic view of sensor behavior for a moment, let's see if we can relate that to the three bar charts.

Remember, the inverse matrix works like this (using the engineering notation) (this is the inverse of the D50 matrix for the EOS 40D):

Matrix_EOS-40D-D50_inverse.jpg

Because the coefficient in the upper left corner tells us how much of the r-ness of the light color shows up in the d sensor output, I can speak of that coefficient as the "r->d" ("r leads to d") coefficient.

Now that I have established that notation, let's look again at the three bar charts for the D50 report for the EOS 40D:

DxOMark_Canon_40D_D50-03-A2.jpg

You will see that I have added something new: notation of the form "r->d" on each of the bars. How do I know which coefficient of the inverse matrix is shown by each bar? By comparing the values of the bars with the coefficients of the matrix!

The most reasonable interpretation of this is that, for example, the second bar on the first chart ("r->e") tells us how the r-ness of the light color influences the value of the E ("Green") detector output, e.

Yet that bar is not the "r" bar on the E channel ("Green") chart; it is the "g" bar on the D channel ("Red") chart.

So I am still rather baffled as to just what is going on here.

I shall ponder further.

Best regards,

Doug
 

Doug Kerr

Well-known member
I begin this next stanza by speaking of the "ISO 17321" matrix (which we might use to transform a set of sensor outputs, d, e, f, "in the best way possible" to a set of sRGB linear coordinates, r, g, b). I present it here labeled "in the engineering fashion", using my notation for the three "inputs" (d, e, f) and the three "outputs" (r, g, b).

I will denote the nine coefficients in a way that reminds us of their role in the composition of each of the three outputs.

Matrix_T.jpg

The matrix is designated T (for "transform"). Coefficient Tdr is the one that takes input d and makes it into an ingredient of output r.

Here we see the inverse of matrix T, designated T':

Matrix_T'.jpg

The designations of its coefficients follows the same plan as for matrix T.

The discussion of the "three bar charts" in the DxO background material suggests that they portray the response of the sensor with regard to what is rather naïvely called the "purity" of the three channels.

Essentially, the channel usually called "R" (I call it "D") is said to be wholly "pure" if it responds only to what I call the r-ness of the color of the light. As a consequence, we can consider its output (which I call d) to directly give us what I call r, one of the three linear sRGB coordinates of the color of the light.

But, the description goes on to say, in reality the three sensor channels are not "pure". Rather, for example, the "R" channel responds, in differing degree, to the r-ness, g-ness, and b-ness of the color of the light.

Now this nice story essentially assumes that the sensor is colorimetric, which in real life it is not, but we will follow along anyway.​

Now, let's look at one of the sets of bar charts in a typical DxO report. These are for the Canon EOS 500D, for illuminant D50.

Canon_500D_bar_charts_D50.jpg


The DxO description intimates that, if the R sensor channel, for example, were "totally pure", then on the bar chart for that channel (the leftmost one), the first bar ("R sRGB") would have some substantial value and the other two would be zero.

Then, the description goes on, we treat the values of the nine bars as the coefficients of a matrix. We then take the inverse of that matrix, and this is the matrix we should use to transform a set of sensor outputs (d, e, f in my notation) to a set of sRGB linear coordinates (r, g, b). And that is the matrix shown on the DxO report page (the "ISO 17321" matrix).

But, going along with this scenario, how should we make the nine bar chart values into the coefficients of that first matrix? Do the three values on one chart make up a column or a row?

Well, we can get at that by looking at the two matrixes above. If the "ISO 17321" matrix is the inverse of the "channel decomposition matrix" (to use DxO's term), then the channel decomposition matrix is the inverse of the ISO 17321 matrix. Thus we can think of the channel decomposition matrix (made up of the values of the nine bars) as the "T'" matrix.

Now let's go back to the discussion of "channel purity" as reflected by the values of the bars on a given sensor channel chart. On the "red" chart (my D), the three bars supposedly tell us the influence on the d value of the r-ness, g-ness, and b-ness of the color, respectively. Thus, the coefficients of the matrix that come from the three bars on the "red" (d) chart must be T'rd, T'gd, and T'bd, in that order. Similarly, the coefficients of the matrix that come from the three bars on the "green" (e) chart must be T're, T'ge, and T'be, in that order. That is, the three bar values on one chart make up a column of matrix T'.

But if we take a given DxO report page, consider the ISO 17321 matrix (matrix T), with its stated coefficients, take its inverse (giving us the corresponding matrix T'), and compare its coefficients with the values of the bars on the three bar charts, we find that the values of the three bars on any given chart are consistent with the coefficients in a row of matrix T'.
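
That row-versus-column finding is easy to check numerically; a sketch, with a hypothetical matrix standing in for the published one:

```python
import numpy as np

T = np.array([[ 1.8, -0.5, -0.2],    # hypothetical "ISO 17321" matrix
              [-0.2,  1.5, -0.3],
              [ 0.1, -0.6,  1.6]])

T_prime = np.linalg.inv(T)           # the "channel decomposition" matrix

# Per the purity story, one chart's three bars should form a COLUMN of T';
# comparing against the published bars, they instead match a ROW:
print("column 1 of T':", T_prime[:, 0])
print("row 1 of T':   ", T_prime[0, :])
# The two would agree only if T' happened to be symmetric, which it is not.
```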

Very curious!

Now we "know" that, despite the nice little story in the DxO explanation of the bar charts, the process does not really work this way:

• DxO determines the response of each of the sensor channels to excitation by light whose colors correspond to the sRGB primaries. Those are presented on the three three-bar charts.

• The nine values are used as the coefficients of a matrix.

• We take the inverse of that matrix.

• This is the matrix presented on the report page (referencing ISO 17321).

Rather, almost certainly, it actually works this way:

• The SMI (sensitivity metamerism index) for the sensor is determined using the procedure defined by ISO 17321 (which involves test shots made of 18 standard color patches, illuminated by a certain illuminant).

• In the course of this, an "optimal" matrix for the transformation of the sensor outputs to the coordinates of a standard color space is developed.

• That matrix, transformed so the color space in which it works is sRGB (giving the linear coordinates), is presented on the DxO report page.

• The inverse of that matrix is calculated.

• Its coefficients are used as the values of the nine chart bars, but with rows and columns erroneously interchanged.

Woof!

So interpretation of the significance of the three bar charts is actually impossible.

That's how it looks here right now.

Best regards,

Doug
 

Doug Kerr

Well-known member
The article in the DxO supporting material archive to which Sophie referred in her note had a section relating to the concept of the "purity" of the sensor channels. It includes these figures, which show the spectral response of the three sensor channels in the Canon 500D (left) and Nikon D5000 (right):

image018-X2.jpg
image019-X2.jpg

Then attention is given to the "bar charts" of the reports for those two cameras (D50 basis):

Canon_500D_bar_charts_D50.jpg


Canon EOS 500D (illuminant D50)

Nikon_D5000_bar_charts_D50.jpg


Nikon D5000 (illuminant D50)​

The article points out that the spectral response curves for the Canon 500D reveal that the "red" sensor channel has a substantial response in the part of the spectrum we can think of as "green", while for the Nikon D5000 the "red" sensor channel has much less response in the part of the spectrum we can think of as "green". (They cite a wavelength of 550 nm as being "the heart of greenness" - my term).

They then relate that to the bar charts, saying in effect that, on the chart for the "Sensor Red Channel", the three bars represent its response to what they describe as the three sRGB primaries (even though Sophie confirms that they do not get these charts by testing the sensors with the three sRGB primaries).

So looking at the leftmost chart for the Canon 500D, we see in fact that the sensor had greater response to the "G" sRGB primary than to the "R" sRGB primary.

Now this is very peculiar - after all, we saw that the spectral response of the "R" sensor channel is higher than would be ideal in the "green" region, but not higher than in the "red" region.

But remember, these bar charts are not derived from the results of measuring the response of the sensor channels to the three sRGB primaries. They are derived by plotting, as bars, the coefficients of the inverse of the "ISO 17321" matrix for this sensor (under the assumed illuminant). That matrix is derived by a complex process such that if we were to use the matrix to transform the sensor outputs to the linear coordinates of the sRGB color space, the average color error (over 18 standard test patches) would be the minimum achievable.

But there is considerable evidence that, in using the coefficients of that inverse matrix to determine the values of the nine bars, the row and column axes of the matrix have been interchanged!

If that is indeed so, and if we were to redraw the chart with the coefficients "going to the bars they should", we would get this set of bar charts:

Canon_500D_bar_charts_D50_reversed.jpg


Canon 500D (illuminant D50) - matrix row/column orientation reversed​

This seems to me to make a much better intuitive fit to the implications of the spectral response curves.

I must admit that this does not seem to exactly work out for the "green" channel. But again, we must remember that the bars do not come from any direct analysis of the spectral response of the sensors (although such is in fact one of the premises of the complicated ISO 17321 procedure).

So I would not expect to be able, "at sight", to accurately determine what the coefficients of the inverse of the ISO 17321 matrix would be!

An interesting exercise!

Best regards,

Doug
 

Doug Kerr

Well-known member
Now, if in fact my revised presentation of the coefficients of the inverse of the ISO 17321 matrix onto a set of bar charts is "correct", how can we interpret those bar charts?

Well, we need to recognize that:

• given that our sensors are not colorimetric, and

• given that the ISO 17321 procedure for dealing with that is very complicated

then there is no easy "exact" interpretation of the bar charts.

But if we are willing to accept the qualifications of "conceptually", "approximately", and "broadly", I believe that we can interpret the bar charts (in my revised form) this way:

On the panel for one "channel" of the sensor, the three bars show approximately its relative response to the r-ness, g-ness, and b-ness of the color on the sensor, where by "b-ness" (for example) I mean the attribute of the color that would be reflected in the b linear coordinate in the sRGB color space.

But isn't that what the DxO writings essentially say for the bar charts in the form they use? Well, they seem to, although not as clearly as would be desirable.

So, are their bar charts just wrong? Well, it looks as if perhaps so.

How could this happen? After all, DxOMark is a very sophisticated organization.

Well, I have to say it is easy to get confused about just what the coefficients of a matrix mean. I had a terrible time with that when I, fairly recently, began looking into various color space transformations done by matrix multiplication. I finally got it sorted out (thanks to Bruce Lindbloom for helping me with that), but I have a number of "cheat sheets" I need to refer to from time to time.

Another reality may be the gap between the mathematicians and the people who design Web pages. Often the handoff from one camp to the other is too casual: "Then you just take the coefficients of that inverse matrix and use them as the heights of the nine bars on the charts, ya got that? Well, I gotta go back to work now."

Well, I gotta go back to work now. Or maybe take a nap.

Yes, I will try and sort DxOMark out on this.

Best regards,

Doug
 