Improving variable to aesthetic mapping (input asked) #406
Hi @mtennekes, |
Tangentially, there is infrastructure in classInt to handle interval closure ( In addition, @dieghernan has contributed a new style: |
Thanks @tim-salabim and @rsbivand. Currently, tmap also treats integers as numeric and character as factors, but since there were a few use cases in which the data values are clearly integers, it would be good to adjust the breaks (or at least the labels) accordingly. The interval closure is not my main concern. It is under control: the argument |
Hi @mtennekes, For my use cases the new legend labels for integers are really helpful. I would prefer an additional option "as.integer" with a default value determined by the class of the variable (integer or numeric). I think a named color vector would be fine for factors and for numeric (or integer) variables as well. A unified approach to defining a palette would be more user-friendly, but I don't know whether this would be too complicated for floating point numbers. |
Hi @mtennekes about the integer legend: 10 years ago I would have thought "great!", now I think it is over-engineering. Does For the color ramps: |
Color assignment is working now. Also the colors from stars are used (I check whether there are duplicated levels and if so, apply droplevels).
library(tmap)
library(stars)
#> Loading required package: abind
#> Loading required package: sf
#> Linking to GEOS 3.8.0, GDAL 2.4.2, PROJ 5.2.0
data(World)
# palette of named colors for a character/factor variable
tm_shape(World) + tm_polygons("income_grp",
palette = c("2. High income: nonOECD" = "red",
"3. Upper middle income" = "green",
"4. Lower middle income" = "pink",
"1. High income: OECD" = "blue",
"5. Low income" = "purple"))
# palette of named colors for a numeric variable
World$income_grp_int <- as.integer(World$income_grp)
tm_shape(World) + tm_polygons("income_grp_int", style = "cat",
palette = c("2" = "red",
"3" = "green",
"4" = "pink",
"1" = "blue",
"5" = "purple"))
# use the colors of a stars object
#getwd()
r = read_stars("pr_landcover_wimperv_10-28-08_se5.img",
RAT = "Land Cover Class", proxy = TRUE)
# downloaded from https://s3-us-west-2.amazonaws.com/mrlc/PR_landcover_wimperv_10-28-08_se5.zip
qtm(r) + tm_legend(outside = TRUE) |
@mtennekes, thank you for opening this discussion.
1. Integer variables
I think it would be a nice addition to tmap, but it is not crucial.
2. Specific value to color mapping
This is, in my opinion, a far more interesting and important feature. It would also be great to make it possible to extend the color mapping to external symbologies (see https://github.com/mtennekes/tmap/issues/65 and r-spatial/discuss#36).
Update: |
Good point @Nowosad ! Hmm, why isn't there an argument to specify whether unused levels are dropped (@mtennekes?) That specific file is crappy: I think it doesn't contain unused levels, but duplicated levels. Also the black-colored category has level |
You can find some examples with unused levels at r-spatial/stars#245 (comment). |
I agree @edzer, but I think there should be an argument in tmap invoking |
Exactly what I'm working on: an argument And I'll add an argument Thanks for your input! |
This is totally great! I provided a bit of code for reference
library(sf)
library(tmap)
library(dplyr)
counties <- read_sf("https://cdn.jsdelivr.net/npm/us-atlas@3/counties-10m.json") %>%
filter(stringr::str_sub(id,1,2) == "36")
n <- nrow(counties)
set.seed(100)
counties <- counties %>%
mutate(
vals_int = sample(1:10, n, replace = TRUE),
vals_cont = rnorm(n)
)
tm_shape(counties) +
tm_polygons("vals_int", style = "pretty")
tm_shape(counties) +
tm_polygons("vals_cont")
|
That's a very nice example @zross. It illustrates another problem:
pretty(runif(100, min = 0, max = 10))
#> [1] 0 2 4 6 8 10
pretty(1L:10L)
#> [1] 0 2 4 6 8 10
When I opened this issue, I thought that changing the labels at the right-hand side of the intervals would be enough (e.g. from 0-10, 10-20 to 0-9, 10-19, etc.). However, in this case it would make more sense to have 1-2, 3-4, 5-6, 7-8, 9-10 (given Any ideas how to tackle this problem? @rsbivand does |
No, |
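To make the 1-2, 3-4, ..., 9-10 idea from the pretty(1L:10L) example concrete, here is a minimal base R sketch. It assumes bins of equal integer width that start at the smallest observed value. The helper name integer_bins is hypothetical; it is not part of tmap or classInt, and it is only one possible way to tackle the problem.

```r
# Hypothetical helper (not part of tmap or classInt): split the observed
# integer range into at most n bins of equal integer width, starting at min(x).
integer_bins <- function(x, n = 5) {
  stopifnot(all(x == round(x)), n >= 1)
  lo <- min(x)
  hi <- max(x)
  width <- ceiling((hi - lo + 1) / n)    # number of integers per bin
  lower <- seq(lo, hi, by = width)       # first integer of each bin
  upper <- pmin(lower + width - 1, hi)   # last integer of each bin
  data.frame(
    lower = lower,
    upper = upper,
    # a bin holding a single integer is labelled with that integer only
    label = ifelse(lower == upper, as.character(lower),
                   paste(lower, "to", upper)),
    stringsAsFactors = FALSE
  )
}

bins <- integer_bins(1:10, n = 5)
bins$label
# returns: "1 to 2", "3 to 4", "5 to 6", "7 to 8", "9 to 10"

# class index of each value, based on the lower bounds
findInterval(1:10, bins$lower)
```

Unlike pretty(), this never produces a leading class that starts below the data (such as 0 for values 1 to 10), at the cost of less "round" break values.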
library(tmap)  # needed for World, tm_shape() and tm_polygons()
data(World)
# as.count is TRUE for integers if style = pretty, fixed, or log10_pretty
# N (natural numbers, with 0)
World$x <- sample(0:20, size = 177, replace = TRUE)
tm_shape(World) + tm_polygons("x")
# N+ (natural numbers, positive)
World$x <- sample(1:20, size = 177, replace = TRUE)
tm_shape(World) + tm_polygons("x")
# Z (integers)
World$x <- sample(-10:10, size = 177, replace = TRUE)
tm_shape(World) + tm_polygons("x")
#> Variable(s) "x" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
# show as continuous (old way)
World$x <- sample(1:20, size = 177, replace = TRUE)
tm_shape(World) + tm_polygons("x", as.count = FALSE)
# style: fixed
tm_shape(World) + tm_polygons("x", breaks = c(1, 5, 10, 20))
# scientific notation (decided to use the set notation)
tm_shape(World) + tm_polygons("x", breaks = c(0, 1, 3, 5, 10, 20),
    legend.format = list(scientific = TRUE))
# style: log10_pretty (continuous)
tm_shape(World) + tm_polygons("pop_est", style = "log10_pretty")
# style: log10_pretty (count)
tm_shape(World) + tm_polygons("pop_est", as.count = TRUE, style = "log10_pretty")
Created on 2020-04-07 by the reprex package (v0.3.0.9001) |
Thank you Martijn, both these enhancements are very helpful for me,
exactly as you are implementing them!
On Sun, Apr 5, 2020 at 1:45 AM mtennekes wrote: (quoted copy of the opening post, reproduced in full at the end of this thread)
|
Wonderful!!
On Tue, Apr 7, 2020 at 11:49 AM mtennekes wrote: (quoted copy of the as.count examples posted above)
|
Re: https://github.com/mtennekes/tmap/issues/406#issuecomment-609428252 classInt 0.4-3 with |
... and already supported by tmap
data(World)
tm_shape(World) + tm_symbols(col = "pop_est_dens",
    style = "headtails", style.args = list(thr = 1)) |
tmap 3.0 on its way to CRAN |
tmap 3.0 will be released in a few days. For this version, I want to improve the variable mapping, so any feedback or tips are welcome.
There is a need for two features:
1. Integer variables
Treat a numeric variable as integer. This is needed because currently the legend labels will be 0 to 10, 10 to 20, 20 to 30, where the presumed intervals are [0, 10), [10, 20) and [20, 30] (so open on the right-hand side, except the last). When the variable is an integer, the legend labels should be 0 to 9, 10 to 19, 20 to 29 (or 30).
I'm thinking about style = "integer" or an additional argument as.integer. The latter probably makes more sense, since many break styles (current options are c("cat", "fixed", "sd", "equal", "pretty", "quantile", "kmeans", "hclust", "bclust", "fisher", "jenks", "log10_pretty")) should handle integers slightly differently. For instance, "log10_pretty" will return 0 to 1, 1 to 10, 10 to 100 when the variable is continuous and should return 0, 1 to 9, 10 to 99 when it is an integer.
What do you think? If we go for the second option, what would be a good name for the argument? as.integer, as.continuous, as.discrete, ...?
Next question: should tmap set the default value of this argument to continuous, or should the default be determined by whether all variable values are integers?
(see also https://github.com/mtennekes/tmap/issues/258 and https://github.com/mtennekes/tmap/issues/399)
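The difference between continuous and integer legend labels described above can be sketched in a few lines of base R. This is only an illustration of the labelling rule, not tmap's implementation; the function name legend_labels and its as_count argument are hypothetical. It assumes intervals that are closed on the left and open on the right, except the last one, which is closed on both sides (so the last label ends at 30 rather than 29).

```r
# Hypothetical helper, not tmap's implementation: build legend labels for the
# classes defined by `breaks`. Intervals are assumed to be [a, b), except the
# last one, which is [a, b].
legend_labels <- function(breaks, as_count = FALSE) {
  k <- length(breaks) - 1          # number of classes
  lower <- breaks[seq_len(k)]
  upper <- breaks[-1]
  if (as_count) {
    # for integers, [a, b) contains a, ..., b - 1; the last class keeps b
    upper[-k] <- upper[-k] - 1
  }
  # a class that holds a single integer is labelled with that integer only
  ifelse(lower == upper, as.character(lower), paste(lower, "to", upper))
}

legend_labels(c(0, 10, 20, 30))
# returns: "0 to 10", "10 to 20", "20 to 30"

legend_labels(c(0, 10, 20, 30), as_count = TRUE)
# returns: "0 to 9", "10 to 19", "20 to 30"

# breaks as produced by a log10-style classification
legend_labels(c(0, 1, 10, 100), as_count = TRUE)
# returns: "0", "1 to 9", "10 to 100"
```

Keeping the break computation unchanged and adjusting only the labels means every break style can share the same rule; whether the last class should read 29 or 30 remains a design choice.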
2. Specific value to color mapping
Sometimes all a user (including myself) wants is to map specific data values to specific colors.
How should this be done? Keep in mind that it should work for integer and categorical data.
For categorical data, we could let the user assign a named color vector to the argument palette, where the names correspond to the levels.
How do we do this for numeric data? A color table? If so, it makes sense to add the labels in this color table as well, rather than via the labels argument. Any ideas?
(see also r-spatial/mapview#208)
@Nowosad @Robinlovelace @sjewo @jannes-m @tim-salabim @edzer @rsbivand @mcSamuelDataSci @zross
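As a rough illustration of the two options raised in point 2, the base R sketch below looks up colors by name for a factor and uses a small color table (value, color, label) for an integer variable. All data, level names, labels and the grey fallback color are made up for the example, and nothing here reflects how tmap resolves palettes internally.

```r
# Categorical data: a named color vector, looked up by level name, so the
# order of the names does not have to match the order of the levels.
income <- factor(c("b", "a", "c", "a"))
pal <- c(c = "pink", a = "blue", b = "red")
fill_cat <- unname(pal[as.character(income)])
fill_cat
# returns: "red", "blue", "pink", "blue"

# Numeric data: a hypothetical color table with one row per data value,
# carrying the legend label alongside the color.
color_table <- data.frame(
  value = c(1L, 2L, 3L),
  color = c("blue", "red", "green"),
  label = c("Low", "Medium", "High"),
  stringsAsFactors = FALSE
)

x <- c(3L, 1L, 2L, 2L, NA, 4L)

# match() gives the table row for each data value (NA when not listed)
idx <- match(x, color_table$value)
fill_num <- color_table$color[idx]
fill_num[is.na(fill_num)] <- "grey80"   # assumed fallback for NA / unlisted values
fill_num
# returns: "green", "blue", "red", "red", "grey80", "grey80"

# legend entries come straight from the table
color_table[, c("label", "color")]
```

A table keeps value, color and label together in one object, which is the argument made above for not passing the labels separately.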