mosaicplot() labels overlap (PR#8536)

Greg Kochanski-4
Full_Name: Greg Kochanski
Version: 2.2.1
OS: Debian Linux (testing)
Submission from: (NULL) (212.159.16.190)


This is really a feature request.

When you do mosaicplot() on a data set where the probability of
several nearby rows is small, then the labels for those
rows are plotted overlapping each other.

This situation can be improved by calling mosaicplot()
with a large value of "off", but sometimes, even off=50
(the largest allowable value) isn't sufficient,
especially if the labels are several characters long.

The problem exists even if the labels don't overlap,
because one needs space between the labels to avoid
confusion.   For instance, labels "L*H", "!H*", and
"L%" when too close together turn into
"L*H!H*L%" which is confusing to anyone.
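The situation described above can be reproduced with a small sketch along the following lines. The counts and the variable names (pos, tone) are made up for illustration; only mosaicplot() and its "off" argument come from the report. How badly the row labels crowd depends on the device size and font.

```r
# Made-up 2 x 4 table: one common category and three rare ones,
# so three of the four rows are very thin.
tab <- as.table(matrix(
  c(500, 400,   # H*
      2,   1,   # L*H
      1,   2,   # !H*
      3,   2),  # L%
  nrow = 2,
  dimnames = list(pos  = c("a", "b"),
                  tone = c("H*", "L*H", "!H*", "L%"))
))

# Default spacing: the labels for the thin rows are likely to run together.
mosaicplot(tab, main = "default off")

# Largest spacing mentioned in the report; this may still not separate them.
mosaicplot(tab, off = 50, main = "off = 50")
```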

The problem could be solved by breaking the assumption that
the label position need always be exactly matched to the
graphic.    This is OK, especially for rows because
(a) the graphical blocks that are part of a single row
aren't aligned with each other anyway, and
(b) if you can read the labels, you can generally
match things up by counting.

A fairly nice way to do this is to position
the labels so as to minimize the
sum of the squared errors between each label center
and the average position of the blocks on that row,
subject to the constraint that labels be
non-overlapping.

This problem is actually not too hard to solve:
it is essentially Kruskal's algorithm for finding
a best-fit monotonic sequence  (which probably exists in
CRAN already).

Neglecting edge effects, assume you have a
vector of desired positions z, and
a vector of minimum widths for each label w.
Then, you can compute the space used up by
the labels:  s[i] = -0.5*w[1] + sum(j<i of w[j]) + 0.5*w[i]
and compute y = M(z-s) + s
where M() gives the best-fit monotonically nondecreasing
fit to its argument.   y should then be the correct
place to put each label.
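A minimal sketch of the computation above, assuming that isoreg() from the stats package can stand in for M() (it computes a least-squares nondecreasing fit). The function name place_labels and the sample values are hypothetical and not part of mosaicplot(). Here s[i] is the center of label i when the labels are packed edge to edge, measured from the center of the first label, so adding s back to a nondecreasing sequence keeps neighboring centers at least 0.5*(w[i] + w[i+1]) apart.

```r
# Hypothetical helper, not part of mosaicplot(): move label centers as
# little as possible (in the least-squares sense) so labels do not overlap.
place_labels <- function(z, w) {
  stopifnot(length(z) == length(w))
  # s[i] = -0.5*w[1] + sum(j < i of w[j]) + 0.5*w[i]
  s <- cumsum(w) - 0.5 * w - 0.5 * w[1]
  # M(): best-fit monotonically nondecreasing sequence (isotonic regression)
  m <- isoreg(z - s)$yf
  m + s
}

z <- c(0.10, 0.12, 0.13, 0.60)  # desired label centers (made-up values)
w <- c(0.08, 0.08, 0.08, 0.08)  # minimum label widths (made-up values)
y <- place_labels(z, w)

# Spacing between neighboring centers; each should be at least 0.08 here.
diff(y)
```

As in the description, edge effects are ignored: nothing in this sketch stops the first or last label from being pushed past the edge of the plot.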

If there's a likelihood of getting a patch accepted,
I could probably supply one.

(Given the opportunity, I'd think about shifting the blocks
up and down also, to do an overall alignment.)

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: mosaicplot() labels overlap (PR#8536)

Achim Zeileis
On Sun, 29 Jan 2006 [hidden email] wrote:

> Full_Name: Greg Kochanski
> Version: 2.2.1
> OS: Debian Linux (testing)
> Submission from: (NULL) (212.159.16.190)
>
>
> This is really a feature request.

Hence not a bug (just for the record).

A potential solution to your problem is to write your own labeling
function for strucplot() in vcd implementing the approach you suggest
below. Also check the available labeling functions whether they can be
used to produce acceptable results. One way which might work, depending on
your specific data, is to rotate the labels. Details can again be found in
the package vignettes.
Z
