optimization challenge

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

optimization challenge

Greg Snow-2
I have a challenge that I want to share with the group.

This is not homework (but I may assign it as such if I teach the appropriate class again) and I have found one solution, so don't need anything urgent.  This is more for fun to see if others can find a better solution than I did.

The challenge:

I want to read a book in a given number of days.  I want to read an integer number of chapters each day (there are more chapters than days), no stopping part way through a chapter, and at least 1 chapter each day.  The chapters are very non uniform in length (some very short, a few very long, many in between) so I would like to come up with a reading schedule that minimizes the variance of the length of the days readings (read multiple short chapters on the same day, long chapters are the only one read that day).  I also want to read through the book in order (no skipping ahead to combine short chapters that are not naturally next to each other.

My thought was that the optim function with method="SANN" would be an appropriate approach, but my first couple of tries did not give very good results.  I have since come up with an optim with SANN solution that gives what I consider good results (but I accept that better is possible).

Below is a data frame with the lengths of the chapters for the book that originally sparked the challenge for me (but the general idea should work for any book).  Each row represents a chapter (in order) with 3 different measures of the length of the chapter.

For this challenge I want to read the book in 128 days (there are 239 chapters).

I will post my solutions in a few days, but I want to wait so that my direction does not influence people from trying other approaches (if there is something better than optim, that is fine).

Good luck for anyone interested in the challenge,

The data frame:

bom3 <- structure(list(Chapter = structure(1:239, .Label = c("1 Nephi 1",
"1 Nephi 2", "1 Nephi 3", "1 Nephi 4", "1 Nephi 5", "1 Nephi 6",
"1 Nephi 7", "1 Nephi 8", "1 Nephi 9", "1 Nephi 10", "1 Nephi 11",
"1 Nephi 12", "1 Nephi 13", "1 Nephi 14", "1 Nephi 15", "1 Nephi 16",
"1 Nephi 17", "1 Nephi 18", "1 Nephi 19", "1 Nephi 20", "1 Nephi 21",
"1 Nephi 22", "2 Nephi 1", "2 Nephi 2", "2 Nephi 3", "2 Nephi 4",
"2 Nephi 5", "2 Nephi 6", "2 Nephi 7", "2 Nephi 8", "2 Nephi 9",
"2 Nephi 10", "2 Nephi 11", "2 Nephi 12", "2 Nephi 13", "2 Nephi 14",
"2 Nephi 15", "2 Nephi 16", "2 Nephi 17", "2 Nephi 18", "2 Nephi 19",
"2 Nephi 20", "2 Nephi 21", "2 Nephi 22", "2 Nephi 23", "2 Nephi 24",
"2 Nephi 25", "2 Nephi 26", "2 Nephi 27", "2 Nephi 28", "2 Nephi 29",
"2 Nephi 30", "2 Nephi 31", "2 Nephi 32", "2 Nephi 33", "Jacob 1",
"Jacob 2", "Jacob 3", "Jacob 4", "Jacob 5", "Jacob 6", "Jacob 7",
"Enos 1", "Jarom 1", "Omni 1", "Words of Mormon 1", "Mosiah 1",
"Mosiah 2", "Mosiah 3", "Mosiah 4", "Mosiah 5", "Mosiah 6", "Mosiah 7",
"Mosiah 8", "Mosiah 9", "Mosiah 10", "Mosiah 11", "Mosiah 12",
"Mosiah 13", "Mosiah 14", "Mosiah 15", "Mosiah 16", "Mosiah 17",
"Mosiah 18", "Mosiah 19", "Mosiah 20", "Mosiah 21", "Mosiah 22",
"Mosiah 23", "Mosiah 24", "Mosiah 25", "Mosiah 26", "Mosiah 27",
"Mosiah 28", "Mosiah 29", "Alma 1", "Alma 2", "Alma 3", "Alma 4",
"Alma 5", "Alma 6", "Alma 7", "Alma 8", "Alma 9", "Alma 10",
"Alma 11", "Alma 12", "Alma 13", "Alma 14", "Alma 15", "Alma 16",
"Alma 17", "Alma 18", "Alma 19", "Alma 20", "Alma 21", "Alma 22",
"Alma 23", "Alma 24", "Alma 25", "Alma 26", "Alma 27", "Alma 28",
"Alma 29", "Alma 30", "Alma 31", "Alma 32", "Alma 33", "Alma 34",
"Alma 35", "Alma 36", "Alma 37", "Alma 38", "Alma 39", "Alma 40",
"Alma 41", "Alma 42", "Alma 43", "Alma 44", "Alma 45", "Alma 46",
"Alma 47", "Alma 48", "Alma 49", "Alma 50", "Alma 51", "Alma 52",
"Alma 53", "Alma 54", "Alma 55", "Alma 56", "Alma 57", "Alma 58",
"Alma 59", "Alma 60", "Alma 61", "Alma 62", "Alma 63", "Helaman 1",
"Helaman 2", "Helaman 3", "Helaman 4", "Helaman 5", "Helaman 6",
"Helaman 7", "Helaman 8", "Helaman 9", "Helaman 10", "Helaman 11",
"Helaman 12", "Helaman 13", "Helaman 14", "Helaman 15", "Helaman 16",
"3 Nephi 1", "3 Nephi 2", "3 Nephi 3", "3 Nephi 4", "3 Nephi 5",
"3 Nephi 6", "3 Nephi 7", "3 Nephi 8", "3 Nephi 9", "3 Nephi 10",
"3 Nephi 11", "3 Nephi 12", "3 Nephi 13", "3 Nephi 14", "3 Nephi 15",
"3 Nephi 16", "3 Nephi 17", "3 Nephi 18", "3 Nephi 19", "3 Nephi 20",
"3 Nephi 21", "3 Nephi 22", "3 Nephi 23", "3 Nephi 24", "3 Nephi 25",
"3 Nephi 26", "3 Nephi 27", "3 Nephi 28", "3 Nephi 29", "3 Nephi 30",
"4 Nephi 1", "Mormon 1", "Mormon 2", "Mormon 3", "Mormon 4",
"Mormon 5", "Mormon 6", "Mormon 7", "Mormon 8", "Mormon 9", "Ether 1",
"Ether 2", "Ether 3", "Ether 4", "Ether 5", "Ether 6", "Ether 7",
"Ether 8", "Ether 9", "Ether 10", "Ether 11", "Ether 12", "Ether 13",
"Ether 14", "Ether 15", "Moroni 1", "Moroni 2", "Moroni 3", "Moroni 4",
"Moroni 5", "Moroni 6", "Moroni 7", "Moroni 8", "Moroni 9", "Moroni 10"
), class = "factor"), Words = c(908L, 879L, 1067L, 1262L, 761L,
202L, 992L, 1221L, 259L, 924L, 1315L, 860L, 1899L, 1284L, 1488L,
1618L, 2523L, 1217L, 1292L, 698L, 945L, 1506L, 1543L, 1460L,
1170L, 1300L, 1169L, 895L, 405L, 812L, 2388L, 966L, 338L, 647L,
587L, 203L, 857L, 370L, 687L, 570L, 587L, 928L, 520L, 134L, 587L,
891L, 1699L, 1483L, 1461L, 1240L, 804L, 708L, 988L, 426L, 647L,
719L, 1365L, 619L, 929L, 3758L, 511L, 1242L, 1160L, 734L, 1398L,
857L, 966L, 2112L, 1117L, 1605L, 740L, 309L, 1555L, 938L, 864L,
957L, 1271L, 1291L, 1073L, 391L, 1056L, 560L, 650L, 1382L, 978L,
963L, 1328L, 620L, 1186L, 954L, 837L, 1200L, 1601L, 812L, 1879L,
1496L, 1433L, 1016L, 1035L, 2787L, 390L, 1443L, 1231L, 1511L,
1442L, 1466L, 1858L, 1347L, 1526L, 711L, 1010L, 1848L, 1623L,
1701L, 1279L, 955L, 1871L, 764L, 1509L, 752L, 1665L, 1260L, 575L,
708L, 2666L, 1478L, 1837L, 855L, 1576L, 699L, 1229L, 2027L, 651L,
793L, 1153L, 701L, 1229L, 2221L, 1243L, 911L, 1809L, 1572L, 1073L,
1345L, 1734L, 1682L, 1757L, 1085L, 988L, 1218L, 2177L, 1517L,
1748L, 489L, 1777L, 854L, 2220L, 593L, 1418L, 599L, 1527L, 1076L,
2084L, 1797L, 1142L, 1210L, 1480L, 720L, 1561L, 873L, 1855L,
1241L, 824L, 1025L, 1340L, 769L, 1353L, 1518L, 931L, 1200L, 1132L,
871L, 965L, 807L, 1434L, 1294L, 834L, 628L, 731L, 913L, 879L,
1344L, 1233L, 1538L, 1155L, 507L, 415L, 620L, 183L, 793L, 1269L,
1457L, 394L, 130L, 1949L, 706L, 1262L, 939L, 822L, 1067L, 914L,
454L, 1710L, 1510L, 901L, 1370L, 1253L, 892L, 226L, 1048L, 899L,
1231L, 1424L, 1415L, 762L, 1536L, 1219L, 1132L, 1314L, 147L,
119L, 120L, 153L, 91L, 335L, 1929L, 1064L, 1012L, 1149L), Letters = c(3746L,
3640L, 4295L, 4968L, 3202L, 777L, 3941L, 4769L, 1075L, 3829L,
5159L, 3527L, 7874L, 5236L, 6149L, 6555L, 10180L, 4960L, 5399L,
2767L, 3758L, 6275L, 6391L, 6151L, 4685L, 5267L, 4767L, 3780L,
1574L, 3261L, 9985L, 4106L, 1332L, 2557L, 2454L, 810L, 3540L,
1406L, 2668L, 2304L, 2474L, 3715L, 2090L, 533L, 2442L, 3587L,
7294L, 6363L, 5809L, 4997L, 3197L, 2941L, 4065L, 1743L, 2539L,
3149L, 5815L, 2765L, 4027L, 15366L, 2091L, 4988L, 4730L, 3149L,
5906L, 3704L, 4188L, 8675L, 4771L, 6586L, 2965L, 1313L, 6326L,
3948L, 3489L, 3942L, 5205L, 5309L, 4300L, 1574L, 4559L, 2393L,
2655L, 5755L, 4070L, 4027L, 5649L, 2651L, 4978L, 3970L, 3610L,
5112L, 6733L, 3513L, 7990L, 6356L, 6179L, 4340L, 4432L, 11260L,
1657L, 5913L, 5033L, 6274L, 5835L, 5865L, 7802L, 5776L, 6295L,
3034L, 4383L, 7842L, 6659L, 6905L, 5104L, 4162L, 7733L, 3321L,
6241L, 3212L, 6840L, 5294L, 2471L, 2748L, 10762L, 6269L, 7479L,
3416L, 6477L, 3043L, 4719L, 8764L, 2494L, 3168L, 4790L, 2983L,
5129L, 9795L, 5043L, 3913L, 7635L, 6693L, 4669L, 5968L, 7508L,
7271L, 7543L, 4730L, 4095L, 5152L, 9129L, 6395L, 7357L, 2166L,
7512L, 3544L, 9554L, 2505L, 6171L, 2568L, 6542L, 4781L, 8655L,
7721L, 4751L, 5134L, 6032L, 3013L, 6536L, 3504L, 7527L, 5106L,
3607L, 4247L, 5524L, 3404L, 5947L, 6506L, 3891L, 5256L, 4944L,
3700L, 4006L, 3392L, 5732L, 5185L, 3402L, 2451L, 2962L, 3679L,
3550L, 5429L, 5083L, 6354L, 4659L, 2155L, 1754L, 2496L, 721L,
3326L, 4917L, 5856L, 1589L, 564L, 8300L, 2987L, 5317L, 3820L,
3595L, 4483L, 3664L, 1805L, 6931L, 6225L, 3508L, 5481L, 5007L,
3623L, 891L, 4265L, 3746L, 5041L, 5823L, 5726L, 3238L, 6443L,
5053L, 4664L, 5370L, 614L, 459L, 491L, 629L, 351L, 1457L, 7812L,
4677L, 4305L, 4603L), Verses = c(20, 24, 31, 38, 22, 6, 22, 38,
6, 22, 36, 23, 42, 30, 36, 39, 55, 25, 24, 22, 26, 31, 32, 30,
25, 35, 34, 18, 11, 25, 54, 25, 8, 22, 26, 6, 30, 13, 25, 22,
21, 34, 16, 6, 22, 32, 30, 33, 35, 32, 14, 18, 21, 9, 15, 19,
35, 14, 18, 77, 13, 27, 27, 15, 30, 18, 18, 41, 27, 30, 15, 7,
33, 21, 19, 22, 29, 37, 35, 12, 31, 15, 20, 35, 29, 26, 36, 16,
39, 25, 24, 39, 37, 20, 47, 33, 38, 27, 20, 62, 8, 27, 32, 34,
32, 46, 37, 31, 29, 19, 21, 39, 43, 36, 30, 23, 35, 18, 30, 17,
37, 30, 14, 17, 60, 38, 43, 23, 41, 16, 30, 47, 15, 19, 26, 15,
31, 54, 24, 24, 41, 36, 25, 30, 40, 37, 40, 23, 24, 35, 57, 36,
41, 13, 36, 21, 52, 17, 34, 14, 37, 26, 52, 41, 29, 28, 41, 19,
38, 26, 39, 31, 17, 25, 30, 19, 26, 33, 26, 30, 26, 25, 22, 19,
41, 48, 34, 27, 24, 20, 25, 39, 36, 46, 29, 17, 14, 18, 6, 21,
33, 39, 9, 2, 49, 19, 29, 22, 23, 24, 22, 10, 41, 37, 43, 25,
28, 19, 6, 30, 27, 26, 35, 34, 23, 41, 31, 31, 34, 4, 3, 4, 3,
2, 9, 48, 30, 26, 34)), .Names = c("Chapter", "Words", "Letters",
"Verses"), row.names = c(NA, -239L), class = "data.frame")


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[hidden email]
801.408.8111

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: optimization challenge

jones-2
Greg

Nice problem: I wasted my whole day on it :-)

I was explaining my plan for a solution to a colleague who is a
computer scientist, he pointed out that I was trying to re-invent the
wheel known as dynamic programming.  here is my code, apparently it is
called "bottom up dynamic programming".  It runs pretty quickly, and
returns (what I hope is :-) the optimal sum of squares and the
cut-points.

function(X=bom3$Verses,days=128){
# find optimal BOM reading schedule for Greg Snow
# minimize variance of quantity to read per day over 128 days
#
N = length(X)
Nm1 = N-1
SSQ<- matrix(NA,nrow=days,ncol=N)
Cuts <- list()
#
#  SSQ[i,j]: the ssqs about the overall mean for the optimal partition
#           for i days on the chapters 1 to j
#
M = sum(X)/days
CS = cumsum(X)
SSQ[1,]= (CS-M)^2
Cuts[[1]]= as.list(1:N)
#
for(m in 2:days){
Cuts[[m]]=list()
#for(i in 1:(m-1)) Cuts[[m]][[i]] = Cuts[[m-1]][[i]]
for(n in m:N){
           CS = cumsum(X[n:1])[n:1]
             SSQ1 = (CS-M)^2
             j = (m-1):(n-1)
             TS = SSQ[m-1,j]+(SSQ1[j+1])
                  SSQ[m,n] = min(TS)
                  k = min(which((min(TS)== TS)))+m-1
                  Cuts[[m]][[n]] = c(Cuts[[m-1]][[k-1]],n)
}
}
list(SSQ=SSQ[days,N],Cuts=Cuts[[days]][[N]])
}

$SSQ
[1] 11241.05

$Cuts
  [1]   2   4   7   9  11  13  15  16  17  19  21  23  25  27  30  31  34  37
 [19]  39  41  44  46  48  50  53  56  59  60  62  64  66  68  70  73  75  77
 [37]  78  80  82  84  86  88  89  91  92  94  95  96  97  99 100 103 105 106
 [55] 108 110 112 113 115 117 119 121 124 125 126 127 129 131 132 135 137 138
 [73] 140 141 142 144 145 146 148 150 151 152 154 156 157 160 162 163 164 166
 [91] 167 169 171 173 175 177 179 181 183 185 186 188 190 192 193 194 196 199
[109] 201 204 205 207 209 211 213 214 215 217 220 222 223 225 226 228 234 236
[127] 238 239




On Tue, Jan 12, 2010 at 11:33:36AM -0700, Greg Snow wrote:

> I have a challenge that I want to share with the group.
>
> This is not homework (but I may assign it as such if I teach the appropriate class again) and I have found one solution, so don't need anything urgent.  This is more for fun to see if others can find a better solution than I did.
>
> The challenge:
>
> I want to read a book in a given number of days.  I want to read an integer number of chapters each day (there are more chapters than days), no stopping part way through a chapter, and at least 1 chapter each day.  The chapters are very non uniform in length (some very short, a few very long, many in between) so I would like to come up with a reading schedule that minimizes the variance of the length of the days readings (read multiple short chapters on the same day, long chapters are the only one read that day).  I also want to read through the book in order (no skipping ahead to combine short chapters that are not naturally next to each other.
>
> My thought was that the optim function with method="SANN" would be an appropriate approach, but my first couple of tries did not give very good results.  I have since come up with an optim with SANN solution that gives what I consider good results (but I accept that better is possible).
>
> Below is a data frame with the lengths of the chapters for the book that originally sparked the challenge for me (but the general idea should work for any book).  Each row represents a chapter (in order) with 3 different measures of the length of the chapter.
>
> For this challenge I want to read the book in 128 days (there are 239 chapters).
>
> I will post my solutions in a few days, but I want to wait so that my direction does not influence people from trying other approaches (if there is something better than optim, that is fine).
>
> Good luck for anyone interested in the challenge,
>
> The data frame:
>
> bom3 <- structure(list(Chapter = structure(1:239, .Label = c("1 Nephi 1",
> "1 Nephi 2", "1 Nephi 3", "1 Nephi 4", "1 Nephi 5", "1 Nephi 6",
> "1 Nephi 7", "1 Nephi 8", "1 Nephi 9", "1 Nephi 10", "1 Nephi 11",
> "1 Nephi 12", "1 Nephi 13", "1 Nephi 14", "1 Nephi 15", "1 Nephi 16",
> "1 Nephi 17", "1 Nephi 18", "1 Nephi 19", "1 Nephi 20", "1 Nephi 21",
> "1 Nephi 22", "2 Nephi 1", "2 Nephi 2", "2 Nephi 3", "2 Nephi 4",
> "2 Nephi 5", "2 Nephi 6", "2 Nephi 7", "2 Nephi 8", "2 Nephi 9",
> "2 Nephi 10", "2 Nephi 11", "2 Nephi 12", "2 Nephi 13", "2 Nephi 14",
> "2 Nephi 15", "2 Nephi 16", "2 Nephi 17", "2 Nephi 18", "2 Nephi 19",
> "2 Nephi 20", "2 Nephi 21", "2 Nephi 22", "2 Nephi 23", "2 Nephi 24",
> "2 Nephi 25", "2 Nephi 26", "2 Nephi 27", "2 Nephi 28", "2 Nephi 29",
> "2 Nephi 30", "2 Nephi 31", "2 Nephi 32", "2 Nephi 33", "Jacob 1",
> "Jacob 2", "Jacob 3", "Jacob 4", "Jacob 5", "Jacob 6", "Jacob 7",
> "Enos 1", "Jarom 1", "Omni 1", "Words of Mormon 1", "Mosiah 1",
> "Mosiah 2", "Mosiah 3", "Mosiah 4", "Mosiah 5", "Mosiah 6", "Mosiah 7",
> "Mosiah 8", "Mosiah 9", "Mosiah 10", "Mosiah 11", "Mosiah 12",
> "Mosiah 13", "Mosiah 14", "Mosiah 15", "Mosiah 16", "Mosiah 17",
> "Mosiah 18", "Mosiah 19", "Mosiah 20", "Mosiah 21", "Mosiah 22",
> "Mosiah 23", "Mosiah 24", "Mosiah 25", "Mosiah 26", "Mosiah 27",
> "Mosiah 28", "Mosiah 29", "Alma 1", "Alma 2", "Alma 3", "Alma 4",
> "Alma 5", "Alma 6", "Alma 7", "Alma 8", "Alma 9", "Alma 10",
> "Alma 11", "Alma 12", "Alma 13", "Alma 14", "Alma 15", "Alma 16",
> "Alma 17", "Alma 18", "Alma 19", "Alma 20", "Alma 21", "Alma 22",
> "Alma 23", "Alma 24", "Alma 25", "Alma 26", "Alma 27", "Alma 28",
> "Alma 29", "Alma 30", "Alma 31", "Alma 32", "Alma 33", "Alma 34",
> "Alma 35", "Alma 36", "Alma 37", "Alma 38", "Alma 39", "Alma 40",
> "Alma 41", "Alma 42", "Alma 43", "Alma 44", "Alma 45", "Alma 46",
> "Alma 47", "Alma 48", "Alma 49", "Alma 50", "Alma 51", "Alma 52",
> "Alma 53", "Alma 54", "Alma 55", "Alma 56", "Alma 57", "Alma 58",
> "Alma 59", "Alma 60", "Alma 61", "Alma 62", "Alma 63", "Helaman 1",
> "Helaman 2", "Helaman 3", "Helaman 4", "Helaman 5", "Helaman 6",
> "Helaman 7", "Helaman 8", "Helaman 9", "Helaman 10", "Helaman 11",
> "Helaman 12", "Helaman 13", "Helaman 14", "Helaman 15", "Helaman 16",
> "3 Nephi 1", "3 Nephi 2", "3 Nephi 3", "3 Nephi 4", "3 Nephi 5",
> "3 Nephi 6", "3 Nephi 7", "3 Nephi 8", "3 Nephi 9", "3 Nephi 10",
> "3 Nephi 11", "3 Nephi 12", "3 Nephi 13", "3 Nephi 14", "3 Nephi 15",
> "3 Nephi 16", "3 Nephi 17", "3 Nephi 18", "3 Nephi 19", "3 Nephi 20",
> "3 Nephi 21", "3 Nephi 22", "3 Nephi 23", "3 Nephi 24", "3 Nephi 25",
> "3 Nephi 26", "3 Nephi 27", "3 Nephi 28", "3 Nephi 29", "3 Nephi 30",
> "4 Nephi 1", "Mormon 1", "Mormon 2", "Mormon 3", "Mormon 4",
> "Mormon 5", "Mormon 6", "Mormon 7", "Mormon 8", "Mormon 9", "Ether 1",
> "Ether 2", "Ether 3", "Ether 4", "Ether 5", "Ether 6", "Ether 7",
> "Ether 8", "Ether 9", "Ether 10", "Ether 11", "Ether 12", "Ether 13",
> "Ether 14", "Ether 15", "Moroni 1", "Moroni 2", "Moroni 3", "Moroni 4",
> "Moroni 5", "Moroni 6", "Moroni 7", "Moroni 8", "Moroni 9", "Moroni 10"
> ), class = "factor"), Words = c(908L, 879L, 1067L, 1262L, 761L,
> 202L, 992L, 1221L, 259L, 924L, 1315L, 860L, 1899L, 1284L, 1488L,
> 1618L, 2523L, 1217L, 1292L, 698L, 945L, 1506L, 1543L, 1460L,
> 1170L, 1300L, 1169L, 895L, 405L, 812L, 2388L, 966L, 338L, 647L,
> 587L, 203L, 857L, 370L, 687L, 570L, 587L, 928L, 520L, 134L, 587L,
> 891L, 1699L, 1483L, 1461L, 1240L, 804L, 708L, 988L, 426L, 647L,
> 719L, 1365L, 619L, 929L, 3758L, 511L, 1242L, 1160L, 734L, 1398L,
> 857L, 966L, 2112L, 1117L, 1605L, 740L, 309L, 1555L, 938L, 864L,
> 957L, 1271L, 1291L, 1073L, 391L, 1056L, 560L, 650L, 1382L, 978L,
> 963L, 1328L, 620L, 1186L, 954L, 837L, 1200L, 1601L, 812L, 1879L,
> 1496L, 1433L, 1016L, 1035L, 2787L, 390L, 1443L, 1231L, 1511L,
> 1442L, 1466L, 1858L, 1347L, 1526L, 711L, 1010L, 1848L, 1623L,
> 1701L, 1279L, 955L, 1871L, 764L, 1509L, 752L, 1665L, 1260L, 575L,
> 708L, 2666L, 1478L, 1837L, 855L, 1576L, 699L, 1229L, 2027L, 651L,
> 793L, 1153L, 701L, 1229L, 2221L, 1243L, 911L, 1809L, 1572L, 1073L,
> 1345L, 1734L, 1682L, 1757L, 1085L, 988L, 1218L, 2177L, 1517L,
> 1748L, 489L, 1777L, 854L, 2220L, 593L, 1418L, 599L, 1527L, 1076L,
> 2084L, 1797L, 1142L, 1210L, 1480L, 720L, 1561L, 873L, 1855L,
> 1241L, 824L, 1025L, 1340L, 769L, 1353L, 1518L, 931L, 1200L, 1132L,
> 871L, 965L, 807L, 1434L, 1294L, 834L, 628L, 731L, 913L, 879L,
> 1344L, 1233L, 1538L, 1155L, 507L, 415L, 620L, 183L, 793L, 1269L,
> 1457L, 394L, 130L, 1949L, 706L, 1262L, 939L, 822L, 1067L, 914L,
> 454L, 1710L, 1510L, 901L, 1370L, 1253L, 892L, 226L, 1048L, 899L,
> 1231L, 1424L, 1415L, 762L, 1536L, 1219L, 1132L, 1314L, 147L,
> 119L, 120L, 153L, 91L, 335L, 1929L, 1064L, 1012L, 1149L), Letters = c(3746L,
> 3640L, 4295L, 4968L, 3202L, 777L, 3941L, 4769L, 1075L, 3829L,
> 5159L, 3527L, 7874L, 5236L, 6149L, 6555L, 10180L, 4960L, 5399L,
> 2767L, 3758L, 6275L, 6391L, 6151L, 4685L, 5267L, 4767L, 3780L,
> 1574L, 3261L, 9985L, 4106L, 1332L, 2557L, 2454L, 810L, 3540L,
> 1406L, 2668L, 2304L, 2474L, 3715L, 2090L, 533L, 2442L, 3587L,
> 7294L, 6363L, 5809L, 4997L, 3197L, 2941L, 4065L, 1743L, 2539L,
> 3149L, 5815L, 2765L, 4027L, 15366L, 2091L, 4988L, 4730L, 3149L,
> 5906L, 3704L, 4188L, 8675L, 4771L, 6586L, 2965L, 1313L, 6326L,
> 3948L, 3489L, 3942L, 5205L, 5309L, 4300L, 1574L, 4559L, 2393L,
> 2655L, 5755L, 4070L, 4027L, 5649L, 2651L, 4978L, 3970L, 3610L,
> 5112L, 6733L, 3513L, 7990L, 6356L, 6179L, 4340L, 4432L, 11260L,
> 1657L, 5913L, 5033L, 6274L, 5835L, 5865L, 7802L, 5776L, 6295L,
> 3034L, 4383L, 7842L, 6659L, 6905L, 5104L, 4162L, 7733L, 3321L,
> 6241L, 3212L, 6840L, 5294L, 2471L, 2748L, 10762L, 6269L, 7479L,
> 3416L, 6477L, 3043L, 4719L, 8764L, 2494L, 3168L, 4790L, 2983L,
> 5129L, 9795L, 5043L, 3913L, 7635L, 6693L, 4669L, 5968L, 7508L,
> 7271L, 7543L, 4730L, 4095L, 5152L, 9129L, 6395L, 7357L, 2166L,
> 7512L, 3544L, 9554L, 2505L, 6171L, 2568L, 6542L, 4781L, 8655L,
> 7721L, 4751L, 5134L, 6032L, 3013L, 6536L, 3504L, 7527L, 5106L,
> 3607L, 4247L, 5524L, 3404L, 5947L, 6506L, 3891L, 5256L, 4944L,
> 3700L, 4006L, 3392L, 5732L, 5185L, 3402L, 2451L, 2962L, 3679L,
> 3550L, 5429L, 5083L, 6354L, 4659L, 2155L, 1754L, 2496L, 721L,
> 3326L, 4917L, 5856L, 1589L, 564L, 8300L, 2987L, 5317L, 3820L,
> 3595L, 4483L, 3664L, 1805L, 6931L, 6225L, 3508L, 5481L, 5007L,
> 3623L, 891L, 4265L, 3746L, 5041L, 5823L, 5726L, 3238L, 6443L,
> 5053L, 4664L, 5370L, 614L, 459L, 491L, 629L, 351L, 1457L, 7812L,
> 4677L, 4305L, 4603L), Verses = c(20, 24, 31, 38, 22, 6, 22, 38,
> 6, 22, 36, 23, 42, 30, 36, 39, 55, 25, 24, 22, 26, 31, 32, 30,
> 25, 35, 34, 18, 11, 25, 54, 25, 8, 22, 26, 6, 30, 13, 25, 22,
> 21, 34, 16, 6, 22, 32, 30, 33, 35, 32, 14, 18, 21, 9, 15, 19,
> 35, 14, 18, 77, 13, 27, 27, 15, 30, 18, 18, 41, 27, 30, 15, 7,
> 33, 21, 19, 22, 29, 37, 35, 12, 31, 15, 20, 35, 29, 26, 36, 16,
> 39, 25, 24, 39, 37, 20, 47, 33, 38, 27, 20, 62, 8, 27, 32, 34,
> 32, 46, 37, 31, 29, 19, 21, 39, 43, 36, 30, 23, 35, 18, 30, 17,
> 37, 30, 14, 17, 60, 38, 43, 23, 41, 16, 30, 47, 15, 19, 26, 15,
> 31, 54, 24, 24, 41, 36, 25, 30, 40, 37, 40, 23, 24, 35, 57, 36,
> 41, 13, 36, 21, 52, 17, 34, 14, 37, 26, 52, 41, 29, 28, 41, 19,
> 38, 26, 39, 31, 17, 25, 30, 19, 26, 33, 26, 30, 26, 25, 22, 19,
> 41, 48, 34, 27, 24, 20, 25, 39, 36, 46, 29, 17, 14, 18, 6, 21,
> 33, 39, 9, 2, 49, 19, 29, 22, 23, 24, 22, 10, 41, 37, 43, 25,
> 28, 19, 6, 30, 27, 26, 35, 34, 23, 41, 31, 31, 34, 4, 3, 4, 3,
> 2, 9, 48, 30, 26, 34)), .Names = c("Chapter", "Words", "Letters",
> "Verses"), row.names = c(NA, -239L), class = "data.frame")
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [hidden email]
> 801.408.8111
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: optimization challenge

Greg Snow-2
WOW, your results give about half the variance of my best optim run (possibly due to my suboptimal use of optim).

Can you describe a little what the algorithm is doing?

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[hidden email]
801.408.8111


> -----Original Message-----
> From: Albyn Jones [mailto:[hidden email]]
> Sent: Tuesday, January 12, 2010 5:31 PM
> To: Greg Snow
> Cc: [hidden email]
> Subject: Re: [R] optimization challenge
>
> Greg
>
> Nice problem: I wasted my whole day on it :-)
>
> I was explaining my plan for a solution to a colleague who is a
> computer scientist, he pointed out that I was trying to re-invent the
> wheel known as dynamic programming.  here is my code, apparently it is
> called "bottom up dynamic programming".  It runs pretty quickly, and
> returns (what I hope is :-) the optimal sum of squares and the
> cut-points.
>
> function(X=bom3$Verses,days=128){
> # find optimal BOM reading schedule for Greg Snow
> # minimize variance of quantity to read per day over 128 days
> #
> N = length(X)
> Nm1 = N-1
> SSQ<- matrix(NA,nrow=days,ncol=N)
> Cuts <- list()
> #
> #  SSQ[i,j]: the ssqs about the overall mean for the optimal partition
> #           for i days on the chapters 1 to j
> #
> M = sum(X)/days
> CS = cumsum(X)
> SSQ[1,]= (CS-M)^2
> Cuts[[1]]= as.list(1:N)
> #
> for(m in 2:days){
> Cuts[[m]]=list()
> #for(i in 1:(m-1)) Cuts[[m]][[i]] = Cuts[[m-1]][[i]]
> for(n in m:N){
>            CS = cumsum(X[n:1])[n:1]
>              SSQ1 = (CS-M)^2
>      j = (m-1):(n-1)
>              TS = SSQ[m-1,j]+(SSQ1[j+1])
>  SSQ[m,n] = min(TS)
>                   k = min(which((min(TS)== TS)))+m-1
>                   Cuts[[m]][[n]] = c(Cuts[[m-1]][[k-1]],n)
> }
> }
> list(SSQ=SSQ[days,N],Cuts=Cuts[[days]][[N]])
> }
>
> $SSQ
> [1] 11241.05
>
> $Cuts
>   [1]   2   4   7   9  11  13  15  16  17  19  21  23  25  27  30  31
> 34  37
>  [19]  39  41  44  46  48  50  53  56  59  60  62  64  66  68  70  73
> 75  77
>  [37]  78  80  82  84  86  88  89  91  92  94  95  96  97  99 100 103
> 105 106
>  [55] 108 110 112 113 115 117 119 121 124 125 126 127 129 131 132 135
> 137 138
>  [73] 140 141 142 144 145 146 148 150 151 152 154 156 157 160 162 163
> 164 166
>  [91] 167 169 171 173 175 177 179 181 183 185 186 188 190 192 193 194
> 196 199
> [109] 201 204 205 207 209 211 213 214 215 217 220 222 223 225 226 228
> 234 236
> [127] 238 239
>
>
>
>
> On Tue, Jan 12, 2010 at 11:33:36AM -0700, Greg Snow wrote:
> > I have a challenge that I want to share with the group.
> >
> > This is not homework (but I may assign it as such if I teach the
> appropriate class again) and I have found one solution, so don't need
> anything urgent.  This is more for fun to see if others can find a
> better solution than I did.
> >
> > The challenge:
> >
> > I want to read a book in a given number of days.  I want to read an
> integer number of chapters each day (there are more chapters than
> days), no stopping part way through a chapter, and at least 1 chapter
> each day.  The chapters are very non uniform in length (some very
> short, a few very long, many in between) so I would like to come up
> with a reading schedule that minimizes the variance of the length of
> the days readings (read multiple short chapters on the same day, long
> chapters are the only one read that day).  I also want to read through
> the book in order (no skipping ahead to combine short chapters that are
> not naturally next to each other.
> >
> > My thought was that the optim function with method="SANN" would be an
> appropriate approach, but my first couple of tries did not give very
> good results.  I have since come up with an optim with SANN solution
> that gives what I consider good results (but I accept that better is
> possible).
> >
> > Below is a data frame with the lengths of the chapters for the book
> that originally sparked the challenge for me (but the general idea
> should work for any book).  Each row represents a chapter (in order)
> with 3 different measures of the length of the chapter.
> >
> > For this challenge I want to read the book in 128 days (there are 239
> chapters).
> >
> > I will post my solutions in a few days, but I want to wait so that my
> direction does not influence people from trying other approaches (if
> there is something better than optim, that is fine).
> >
> > Good luck for anyone interested in the challenge,
> >
> > The data frame:
> >
> > bom3 <- structure(list(Chapter = structure(1:239, .Label = c("1 Nephi
> 1",
> > "1 Nephi 2", "1 Nephi 3", "1 Nephi 4", "1 Nephi 5", "1 Nephi 6",
> > "1 Nephi 7", "1 Nephi 8", "1 Nephi 9", "1 Nephi 10", "1 Nephi 11",
> > "1 Nephi 12", "1 Nephi 13", "1 Nephi 14", "1 Nephi 15", "1 Nephi 16",
> > "1 Nephi 17", "1 Nephi 18", "1 Nephi 19", "1 Nephi 20", "1 Nephi 21",
> > "1 Nephi 22", "2 Nephi 1", "2 Nephi 2", "2 Nephi 3", "2 Nephi 4",
> > "2 Nephi 5", "2 Nephi 6", "2 Nephi 7", "2 Nephi 8", "2 Nephi 9",
> > "2 Nephi 10", "2 Nephi 11", "2 Nephi 12", "2 Nephi 13", "2 Nephi 14",
> > "2 Nephi 15", "2 Nephi 16", "2 Nephi 17", "2 Nephi 18", "2 Nephi 19",
> > "2 Nephi 20", "2 Nephi 21", "2 Nephi 22", "2 Nephi 23", "2 Nephi 24",
> > "2 Nephi 25", "2 Nephi 26", "2 Nephi 27", "2 Nephi 28", "2 Nephi 29",
> > "2 Nephi 30", "2 Nephi 31", "2 Nephi 32", "2 Nephi 33", "Jacob 1",
> > "Jacob 2", "Jacob 3", "Jacob 4", "Jacob 5", "Jacob 6", "Jacob 7",
> > "Enos 1", "Jarom 1", "Omni 1", "Words of Mormon 1", "Mosiah 1",
> > "Mosiah 2", "Mosiah 3", "Mosiah 4", "Mosiah 5", "Mosiah 6", "Mosiah
> 7",
> > "Mosiah 8", "Mosiah 9", "Mosiah 10", "Mosiah 11", "Mosiah 12",
> > "Mosiah 13", "Mosiah 14", "Mosiah 15", "Mosiah 16", "Mosiah 17",
> > "Mosiah 18", "Mosiah 19", "Mosiah 20", "Mosiah 21", "Mosiah 22",
> > "Mosiah 23", "Mosiah 24", "Mosiah 25", "Mosiah 26", "Mosiah 27",
> > "Mosiah 28", "Mosiah 29", "Alma 1", "Alma 2", "Alma 3", "Alma 4",
> > "Alma 5", "Alma 6", "Alma 7", "Alma 8", "Alma 9", "Alma 10",
> > "Alma 11", "Alma 12", "Alma 13", "Alma 14", "Alma 15", "Alma 16",
> > "Alma 17", "Alma 18", "Alma 19", "Alma 20", "Alma 21", "Alma 22",
> > "Alma 23", "Alma 24", "Alma 25", "Alma 26", "Alma 27", "Alma 28",
> > "Alma 29", "Alma 30", "Alma 31", "Alma 32", "Alma 33", "Alma 34",
> > "Alma 35", "Alma 36", "Alma 37", "Alma 38", "Alma 39", "Alma 40",
> > "Alma 41", "Alma 42", "Alma 43", "Alma 44", "Alma 45", "Alma 46",
> > "Alma 47", "Alma 48", "Alma 49", "Alma 50", "Alma 51", "Alma 52",
> > "Alma 53", "Alma 54", "Alma 55", "Alma 56", "Alma 57", "Alma 58",
> > "Alma 59", "Alma 60", "Alma 61", "Alma 62", "Alma 63", "Helaman 1",
> > "Helaman 2", "Helaman 3", "Helaman 4", "Helaman 5", "Helaman 6",
> > "Helaman 7", "Helaman 8", "Helaman 9", "Helaman 10", "Helaman 11",
> > "Helaman 12", "Helaman 13", "Helaman 14", "Helaman 15", "Helaman 16",
> > "3 Nephi 1", "3 Nephi 2", "3 Nephi 3", "3 Nephi 4", "3 Nephi 5",
> > "3 Nephi 6", "3 Nephi 7", "3 Nephi 8", "3 Nephi 9", "3 Nephi 10",
> > "3 Nephi 11", "3 Nephi 12", "3 Nephi 13", "3 Nephi 14", "3 Nephi 15",
> > "3 Nephi 16", "3 Nephi 17", "3 Nephi 18", "3 Nephi 19", "3 Nephi 20",
> > "3 Nephi 21", "3 Nephi 22", "3 Nephi 23", "3 Nephi 24", "3 Nephi 25",
> > "3 Nephi 26", "3 Nephi 27", "3 Nephi 28", "3 Nephi 29", "3 Nephi 30",
> > "4 Nephi 1", "Mormon 1", "Mormon 2", "Mormon 3", "Mormon 4",
> > "Mormon 5", "Mormon 6", "Mormon 7", "Mormon 8", "Mormon 9", "Ether
> 1",
> > "Ether 2", "Ether 3", "Ether 4", "Ether 5", "Ether 6", "Ether 7",
> > "Ether 8", "Ether 9", "Ether 10", "Ether 11", "Ether 12", "Ether 13",
> > "Ether 14", "Ether 15", "Moroni 1", "Moroni 2", "Moroni 3", "Moroni
> 4",
> > "Moroni 5", "Moroni 6", "Moroni 7", "Moroni 8", "Moroni 9", "Moroni
> 10"
> > ), class = "factor"), Words = c(908L, 879L, 1067L, 1262L, 761L,
> > 202L, 992L, 1221L, 259L, 924L, 1315L, 860L, 1899L, 1284L, 1488L,
> > 1618L, 2523L, 1217L, 1292L, 698L, 945L, 1506L, 1543L, 1460L,
> > 1170L, 1300L, 1169L, 895L, 405L, 812L, 2388L, 966L, 338L, 647L,
> > 587L, 203L, 857L, 370L, 687L, 570L, 587L, 928L, 520L, 134L, 587L,
> > 891L, 1699L, 1483L, 1461L, 1240L, 804L, 708L, 988L, 426L, 647L,
> > 719L, 1365L, 619L, 929L, 3758L, 511L, 1242L, 1160L, 734L, 1398L,
> > 857L, 966L, 2112L, 1117L, 1605L, 740L, 309L, 1555L, 938L, 864L,
> > 957L, 1271L, 1291L, 1073L, 391L, 1056L, 560L, 650L, 1382L, 978L,
> > 963L, 1328L, 620L, 1186L, 954L, 837L, 1200L, 1601L, 812L, 1879L,
> > 1496L, 1433L, 1016L, 1035L, 2787L, 390L, 1443L, 1231L, 1511L,
> > 1442L, 1466L, 1858L, 1347L, 1526L, 711L, 1010L, 1848L, 1623L,
> > 1701L, 1279L, 955L, 1871L, 764L, 1509L, 752L, 1665L, 1260L, 575L,
> > 708L, 2666L, 1478L, 1837L, 855L, 1576L, 699L, 1229L, 2027L, 651L,
> > 793L, 1153L, 701L, 1229L, 2221L, 1243L, 911L, 1809L, 1572L, 1073L,
> > 1345L, 1734L, 1682L, 1757L, 1085L, 988L, 1218L, 2177L, 1517L,
> > 1748L, 489L, 1777L, 854L, 2220L, 593L, 1418L, 599L, 1527L, 1076L,
> > 2084L, 1797L, 1142L, 1210L, 1480L, 720L, 1561L, 873L, 1855L,
> > 1241L, 824L, 1025L, 1340L, 769L, 1353L, 1518L, 931L, 1200L, 1132L,
> > 871L, 965L, 807L, 1434L, 1294L, 834L, 628L, 731L, 913L, 879L,
> > 1344L, 1233L, 1538L, 1155L, 507L, 415L, 620L, 183L, 793L, 1269L,
> > 1457L, 394L, 130L, 1949L, 706L, 1262L, 939L, 822L, 1067L, 914L,
> > 454L, 1710L, 1510L, 901L, 1370L, 1253L, 892L, 226L, 1048L, 899L,
> > 1231L, 1424L, 1415L, 762L, 1536L, 1219L, 1132L, 1314L, 147L,
> > 119L, 120L, 153L, 91L, 335L, 1929L, 1064L, 1012L, 1149L), Letters =
> c(3746L,
> > 3640L, 4295L, 4968L, 3202L, 777L, 3941L, 4769L, 1075L, 3829L,
> > 5159L, 3527L, 7874L, 5236L, 6149L, 6555L, 10180L, 4960L, 5399L,
> > 2767L, 3758L, 6275L, 6391L, 6151L, 4685L, 5267L, 4767L, 3780L,
> > 1574L, 3261L, 9985L, 4106L, 1332L, 2557L, 2454L, 810L, 3540L,
> > 1406L, 2668L, 2304L, 2474L, 3715L, 2090L, 533L, 2442L, 3587L,
> > 7294L, 6363L, 5809L, 4997L, 3197L, 2941L, 4065L, 1743L, 2539L,
> > 3149L, 5815L, 2765L, 4027L, 15366L, 2091L, 4988L, 4730L, 3149L,
> > 5906L, 3704L, 4188L, 8675L, 4771L, 6586L, 2965L, 1313L, 6326L,
> > 3948L, 3489L, 3942L, 5205L, 5309L, 4300L, 1574L, 4559L, 2393L,
> > 2655L, 5755L, 4070L, 4027L, 5649L, 2651L, 4978L, 3970L, 3610L,
> > 5112L, 6733L, 3513L, 7990L, 6356L, 6179L, 4340L, 4432L, 11260L,
> > 1657L, 5913L, 5033L, 6274L, 5835L, 5865L, 7802L, 5776L, 6295L,
> > 3034L, 4383L, 7842L, 6659L, 6905L, 5104L, 4162L, 7733L, 3321L,
> > 6241L, 3212L, 6840L, 5294L, 2471L, 2748L, 10762L, 6269L, 7479L,
> > 3416L, 6477L, 3043L, 4719L, 8764L, 2494L, 3168L, 4790L, 2983L,
> > 5129L, 9795L, 5043L, 3913L, 7635L, 6693L, 4669L, 5968L, 7508L,
> > 7271L, 7543L, 4730L, 4095L, 5152L, 9129L, 6395L, 7357L, 2166L,
> > 7512L, 3544L, 9554L, 2505L, 6171L, 2568L, 6542L, 4781L, 8655L,
> > 7721L, 4751L, 5134L, 6032L, 3013L, 6536L, 3504L, 7527L, 5106L,
> > 3607L, 4247L, 5524L, 3404L, 5947L, 6506L, 3891L, 5256L, 4944L,
> > 3700L, 4006L, 3392L, 5732L, 5185L, 3402L, 2451L, 2962L, 3679L,
> > 3550L, 5429L, 5083L, 6354L, 4659L, 2155L, 1754L, 2496L, 721L,
> > 3326L, 4917L, 5856L, 1589L, 564L, 8300L, 2987L, 5317L, 3820L,
> > 3595L, 4483L, 3664L, 1805L, 6931L, 6225L, 3508L, 5481L, 5007L,
> > 3623L, 891L, 4265L, 3746L, 5041L, 5823L, 5726L, 3238L, 6443L,
> > 5053L, 4664L, 5370L, 614L, 459L, 491L, 629L, 351L, 1457L, 7812L,
> > 4677L, 4305L, 4603L), Verses = c(20, 24, 31, 38, 22, 6, 22, 38,
> > 6, 22, 36, 23, 42, 30, 36, 39, 55, 25, 24, 22, 26, 31, 32, 30,
> > 25, 35, 34, 18, 11, 25, 54, 25, 8, 22, 26, 6, 30, 13, 25, 22,
> > 21, 34, 16, 6, 22, 32, 30, 33, 35, 32, 14, 18, 21, 9, 15, 19,
> > 35, 14, 18, 77, 13, 27, 27, 15, 30, 18, 18, 41, 27, 30, 15, 7,
> > 33, 21, 19, 22, 29, 37, 35, 12, 31, 15, 20, 35, 29, 26, 36, 16,
> > 39, 25, 24, 39, 37, 20, 47, 33, 38, 27, 20, 62, 8, 27, 32, 34,
> > 32, 46, 37, 31, 29, 19, 21, 39, 43, 36, 30, 23, 35, 18, 30, 17,
> > 37, 30, 14, 17, 60, 38, 43, 23, 41, 16, 30, 47, 15, 19, 26, 15,
> > 31, 54, 24, 24, 41, 36, 25, 30, 40, 37, 40, 23, 24, 35, 57, 36,
> > 41, 13, 36, 21, 52, 17, 34, 14, 37, 26, 52, 41, 29, 28, 41, 19,
> > 38, 26, 39, 31, 17, 25, 30, 19, 26, 33, 26, 30, 26, 25, 22, 19,
> > 41, 48, 34, 27, 24, 20, 25, 39, 36, 46, 29, 17, 14, 18, 6, 21,
> > 33, 39, 9, 2, 49, 19, 29, 22, 23, 24, 22, 10, 41, 37, 43, 25,
> > 28, 19, 6, 30, 27, 26, 35, 34, 23, 41, 31, 31, 34, 4, 3, 4, 3,
> > 2, 9, 48, 30, 26, 34)), .Names = c("Chapter", "Words", "Letters",
> > "Verses"), row.names = c(NA, -239L), class = "data.frame")
> >
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > [hidden email]
> > 801.408.8111
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: optimization challenge

jones-2
The key idea is that you are building a matrix that contains the
solutions to smaller problems which are sub-problems of the big
problem.  The first row of the matrix SSQ contains the solution for no
splits, ie SSQ[1,j] is just the sum of squares about the overall mean
for reading chapters1 through j in one day.  The iteration then uses
row m-1 to construct row m, since if SSQ[m-1,j] (optimal reading of j
chapters in m-1 days) is part of the overall optimal solution, you
have already computed it, and so don't ever need to recompute it.

       TS = SSQ[m-1,j]+(SSQ1[j+1])

computes the vector of possible solutions for SSQ[m,n] (n chapters in n days)
breaking it into two pieces: chapters 1 to j in m-1 days, and chapters j+1 to
n in 1 day.  j is a vector in the function, and min(TS) is the minimum
over choices of j, ie SSQ[m,n].

At the end, SSQ[128,239] is the optimal value for reading all 239
chapters in 128 days.  That's just the objective function, so the rest
involves constructing the list of optimal cuts, ie which chapters are
grouped together for each day's reading.  That code uses the same
idea... constructing a list of lists of cutpoints.

statisticians should study a bit of data structures and algorithms!

albyn

On Wed, Jan 13, 2010 at 10:45:11AM -0700, Greg Snow wrote:

> WOW, your results give about half the variance of my best optim run (possibly due to my suboptimal use of optim).
>
> Can you describe a little what the algorithm is doing?
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [hidden email]
> 801.408.8111
>
>
> > -----Original Message-----
> > From: Albyn Jones [mailto:[hidden email]]
> > Sent: Tuesday, January 12, 2010 5:31 PM
> > To: Greg Snow
> > Cc: [hidden email]
> > Subject: Re: [R] optimization challenge
> >
> > Greg
> >
> > Nice problem: I wasted my whole day on it :-)
> >
> > I was explaining my plan for a solution to a colleague who is a
> > computer scientist, he pointed out that I was trying to re-invent the
> > wheel known as dynamic programming.  here is my code, apparently it is
> > called "bottom up dynamic programming".  It runs pretty quickly, and
> > returns (what I hope is :-) the optimal sum of squares and the
> > cut-points.
> >
> > function(X=bom3$Verses,days=128){
> > # find optimal BOM reading schedule for Greg Snow
> > # minimize variance of quantity to read per day over 128 days
> > #
> > N = length(X)
> > Nm1 = N-1
> > SSQ<- matrix(NA,nrow=days,ncol=N)
> > Cuts <- list()
> > #
> > #  SSQ[i,j]: the ssqs about the overall mean for the optimal partition
> > #           for i days on the chapters 1 to j
> > #
> > M = sum(X)/days
> > CS = cumsum(X)
> > SSQ[1,]= (CS-M)^2
> > Cuts[[1]]= as.list(1:N)
> > #
> > for(m in 2:days){
> > Cuts[[m]]=list()
> > #for(i in 1:(m-1)) Cuts[[m]][[i]] = Cuts[[m-1]][[i]]
> > for(n in m:N){
> >            CS = cumsum(X[n:1])[n:1]
> >              SSQ1 = (CS-M)^2
> >      j = (m-1):(n-1)
> >              TS = SSQ[m-1,j]+(SSQ1[j+1])
> >  SSQ[m,n] = min(TS)
> >                   k = min(which((min(TS)== TS)))+m-1
> >                   Cuts[[m]][[n]] = c(Cuts[[m-1]][[k-1]],n)
> > }
> > }
> > list(SSQ=SSQ[days,N],Cuts=Cuts[[days]][[N]])
> > }
> >
> > $SSQ
> > [1] 11241.05
> >
> > $Cuts
> >   [1]   2   4   7   9  11  13  15  16  17  19  21  23  25  27  30  31
> > 34  37
> >  [19]  39  41  44  46  48  50  53  56  59  60  62  64  66  68  70  73
> > 75  77
> >  [37]  78  80  82  84  86  88  89  91  92  94  95  96  97  99 100 103
> > 105 106
> >  [55] 108 110 112 113 115 117 119 121 124 125 126 127 129 131 132 135
> > 137 138
> >  [73] 140 141 142 144 145 146 148 150 151 152 154 156 157 160 162 163
> > 164 166
> >  [91] 167 169 171 173 175 177 179 181 183 185 186 188 190 192 193 194
> > 196 199
> > [109] 201 204 205 207 209 211 213 214 215 217 220 222 223 225 226 228
> > 234 236
> > [127] 238 239
> >
> >
> >
> >
> > On Tue, Jan 12, 2010 at 11:33:36AM -0700, Greg Snow wrote:
> > > I have a challenge that I want to share with the group.
> > >
> > > This is not homework (but I may assign it as such if I teach the
> > appropriate class again) and I have found one solution, so don't need
> > anything urgent.  This is more for fun to see if others can find a
> > better solution than I did.
> > >
> > > The challenge:
> > >
> > > I want to read a book in a given number of days.  I want to read an
> > integer number of chapters each day (there are more chapters than
> > days), no stopping part way through a chapter, and at least 1 chapter
> > each day.  The chapters are very non uniform in length (some very
> > short, a few very long, many in between) so I would like to come up
> > with a reading schedule that minimizes the variance of the length of
> > the days readings (read multiple short chapters on the same day, long
> > chapters are the only one read that day).  I also want to read through
> > the book in order (no skipping ahead to combine short chapters that are
> > not naturally next to each other.
> > >
> > > My thought was that the optim function with method="SANN" would be an
> > appropriate approach, but my first couple of tries did not give very
> > good results.  I have since come up with an optim with SANN solution
> > that gives what I consider good results (but I accept that better is
> > possible).
> > >
> > > Below is a data frame with the lengths of the chapters for the book
> > that originally sparked the challenge for me (but the general idea
> > should work for any book).  Each row represents a chapter (in order)
> > with 3 different measures of the length of the chapter.
> > >
> > > For this challenge I want to read the book in 128 days (there are 239
> > chapters).
> > >
> > > I will post my solutions in a few days, but I want to wait so that my
> > direction does not influence people from trying other approaches (if
> > there is something better than optim, that is fine).
> > >
> > > Good luck for anyone interested in the challenge,
> > >
> > > The data frame:
> > >
> > > bom3 <- structure(list(Chapter = structure(1:239, .Label = c("1 Nephi
> > 1",
> > > "1 Nephi 2", "1 Nephi 3", "1 Nephi 4", "1 Nephi 5", "1 Nephi 6",
> > > "1 Nephi 7", "1 Nephi 8", "1 Nephi 9", "1 Nephi 10", "1 Nephi 11",
> > > "1 Nephi 12", "1 Nephi 13", "1 Nephi 14", "1 Nephi 15", "1 Nephi 16",
> > > "1 Nephi 17", "1 Nephi 18", "1 Nephi 19", "1 Nephi 20", "1 Nephi 21",
> > > "1 Nephi 22", "2 Nephi 1", "2 Nephi 2", "2 Nephi 3", "2 Nephi 4",
> > > "2 Nephi 5", "2 Nephi 6", "2 Nephi 7", "2 Nephi 8", "2 Nephi 9",
> > > "2 Nephi 10", "2 Nephi 11", "2 Nephi 12", "2 Nephi 13", "2 Nephi 14",
> > > "2 Nephi 15", "2 Nephi 16", "2 Nephi 17", "2 Nephi 18", "2 Nephi 19",
> > > "2 Nephi 20", "2 Nephi 21", "2 Nephi 22", "2 Nephi 23", "2 Nephi 24",
> > > "2 Nephi 25", "2 Nephi 26", "2 Nephi 27", "2 Nephi 28", "2 Nephi 29",
> > > "2 Nephi 30", "2 Nephi 31", "2 Nephi 32", "2 Nephi 33", "Jacob 1",
> > > "Jacob 2", "Jacob 3", "Jacob 4", "Jacob 5", "Jacob 6", "Jacob 7",
> > > "Enos 1", "Jarom 1", "Omni 1", "Words of Mormon 1", "Mosiah 1",
> > > "Mosiah 2", "Mosiah 3", "Mosiah 4", "Mosiah 5", "Mosiah 6", "Mosiah
> > 7",
> > > "Mosiah 8", "Mosiah 9", "Mosiah 10", "Mosiah 11", "Mosiah 12",
> > > "Mosiah 13", "Mosiah 14", "Mosiah 15", "Mosiah 16", "Mosiah 17",
> > > "Mosiah 18", "Mosiah 19", "Mosiah 20", "Mosiah 21", "Mosiah 22",
> > > "Mosiah 23", "Mosiah 24", "Mosiah 25", "Mosiah 26", "Mosiah 27",
> > > "Mosiah 28", "Mosiah 29", "Alma 1", "Alma 2", "Alma 3", "Alma 4",
> > > "Alma 5", "Alma 6", "Alma 7", "Alma 8", "Alma 9", "Alma 10",
> > > "Alma 11", "Alma 12", "Alma 13", "Alma 14", "Alma 15", "Alma 16",
> > > "Alma 17", "Alma 18", "Alma 19", "Alma 20", "Alma 21", "Alma 22",
> > > "Alma 23", "Alma 24", "Alma 25", "Alma 26", "Alma 27", "Alma 28",
> > > "Alma 29", "Alma 30", "Alma 31", "Alma 32", "Alma 33", "Alma 34",
> > > "Alma 35", "Alma 36", "Alma 37", "Alma 38", "Alma 39", "Alma 40",
> > > "Alma 41", "Alma 42", "Alma 43", "Alma 44", "Alma 45", "Alma 46",
> > > "Alma 47", "Alma 48", "Alma 49", "Alma 50", "Alma 51", "Alma 52",
> > > "Alma 53", "Alma 54", "Alma 55", "Alma 56", "Alma 57", "Alma 58",
> > > "Alma 59", "Alma 60", "Alma 61", "Alma 62", "Alma 63", "Helaman 1",
> > > "Helaman 2", "Helaman 3", "Helaman 4", "Helaman 5", "Helaman 6",
> > > "Helaman 7", "Helaman 8", "Helaman 9", "Helaman 10", "Helaman 11",
> > > "Helaman 12", "Helaman 13", "Helaman 14", "Helaman 15", "Helaman 16",
> > > "3 Nephi 1", "3 Nephi 2", "3 Nephi 3", "3 Nephi 4", "3 Nephi 5",
> > > "3 Nephi 6", "3 Nephi 7", "3 Nephi 8", "3 Nephi 9", "3 Nephi 10",
> > > "3 Nephi 11", "3 Nephi 12", "3 Nephi 13", "3 Nephi 14", "3 Nephi 15",
> > > "3 Nephi 16", "3 Nephi 17", "3 Nephi 18", "3 Nephi 19", "3 Nephi 20",
> > > "3 Nephi 21", "3 Nephi 22", "3 Nephi 23", "3 Nephi 24", "3 Nephi 25",
> > > "3 Nephi 26", "3 Nephi 27", "3 Nephi 28", "3 Nephi 29", "3 Nephi 30",
> > > "4 Nephi 1", "Mormon 1", "Mormon 2", "Mormon 3", "Mormon 4",
> > > "Mormon 5", "Mormon 6", "Mormon 7", "Mormon 8", "Mormon 9", "Ether
> > 1",
> > > "Ether 2", "Ether 3", "Ether 4", "Ether 5", "Ether 6", "Ether 7",
> > > "Ether 8", "Ether 9", "Ether 10", "Ether 11", "Ether 12", "Ether 13",
> > > "Ether 14", "Ether 15", "Moroni 1", "Moroni 2", "Moroni 3", "Moroni
> > 4",
> > > "Moroni 5", "Moroni 6", "Moroni 7", "Moroni 8", "Moroni 9", "Moroni
> > 10"
> > > ), class = "factor"), Words = c(908L, 879L, 1067L, 1262L, 761L,
> > > 202L, 992L, 1221L, 259L, 924L, 1315L, 860L, 1899L, 1284L, 1488L,
> > > 1618L, 2523L, 1217L, 1292L, 698L, 945L, 1506L, 1543L, 1460L,
> > > 1170L, 1300L, 1169L, 895L, 405L, 812L, 2388L, 966L, 338L, 647L,
> > > 587L, 203L, 857L, 370L, 687L, 570L, 587L, 928L, 520L, 134L, 587L,
> > > 891L, 1699L, 1483L, 1461L, 1240L, 804L, 708L, 988L, 426L, 647L,
> > > 719L, 1365L, 619L, 929L, 3758L, 511L, 1242L, 1160L, 734L, 1398L,
> > > 857L, 966L, 2112L, 1117L, 1605L, 740L, 309L, 1555L, 938L, 864L,
> > > 957L, 1271L, 1291L, 1073L, 391L, 1056L, 560L, 650L, 1382L, 978L,
> > > 963L, 1328L, 620L, 1186L, 954L, 837L, 1200L, 1601L, 812L, 1879L,
> > > 1496L, 1433L, 1016L, 1035L, 2787L, 390L, 1443L, 1231L, 1511L,
> > > 1442L, 1466L, 1858L, 1347L, 1526L, 711L, 1010L, 1848L, 1623L,
> > > 1701L, 1279L, 955L, 1871L, 764L, 1509L, 752L, 1665L, 1260L, 575L,
> > > 708L, 2666L, 1478L, 1837L, 855L, 1576L, 699L, 1229L, 2027L, 651L,
> > > 793L, 1153L, 701L, 1229L, 2221L, 1243L, 911L, 1809L, 1572L, 1073L,
> > > 1345L, 1734L, 1682L, 1757L, 1085L, 988L, 1218L, 2177L, 1517L,
> > > 1748L, 489L, 1777L, 854L, 2220L, 593L, 1418L, 599L, 1527L, 1076L,
> > > 2084L, 1797L, 1142L, 1210L, 1480L, 720L, 1561L, 873L, 1855L,
> > > 1241L, 824L, 1025L, 1340L, 769L, 1353L, 1518L, 931L, 1200L, 1132L,
> > > 871L, 965L, 807L, 1434L, 1294L, 834L, 628L, 731L, 913L, 879L,
> > > 1344L, 1233L, 1538L, 1155L, 507L, 415L, 620L, 183L, 793L, 1269L,
> > > 1457L, 394L, 130L, 1949L, 706L, 1262L, 939L, 822L, 1067L, 914L,
> > > 454L, 1710L, 1510L, 901L, 1370L, 1253L, 892L, 226L, 1048L, 899L,
> > > 1231L, 1424L, 1415L, 762L, 1536L, 1219L, 1132L, 1314L, 147L,
> > > 119L, 120L, 153L, 91L, 335L, 1929L, 1064L, 1012L, 1149L), Letters =
> > c(3746L,
> > > 3640L, 4295L, 4968L, 3202L, 777L, 3941L, 4769L, 1075L, 3829L,
> > > 5159L, 3527L, 7874L, 5236L, 6149L, 6555L, 10180L, 4960L, 5399L,
> > > 2767L, 3758L, 6275L, 6391L, 6151L, 4685L, 5267L, 4767L, 3780L,
> > > 1574L, 3261L, 9985L, 4106L, 1332L, 2557L, 2454L, 810L, 3540L,
> > > 1406L, 2668L, 2304L, 2474L, 3715L, 2090L, 533L, 2442L, 3587L,
> > > 7294L, 6363L, 5809L, 4997L, 3197L, 2941L, 4065L, 1743L, 2539L,
> > > 3149L, 5815L, 2765L, 4027L, 15366L, 2091L, 4988L, 4730L, 3149L,
> > > 5906L, 3704L, 4188L, 8675L, 4771L, 6586L, 2965L, 1313L, 6326L,
> > > 3948L, 3489L, 3942L, 5205L, 5309L, 4300L, 1574L, 4559L, 2393L,
> > > 2655L, 5755L, 4070L, 4027L, 5649L, 2651L, 4978L, 3970L, 3610L,
> > > 5112L, 6733L, 3513L, 7990L, 6356L, 6179L, 4340L, 4432L, 11260L,
> > > 1657L, 5913L, 5033L, 6274L, 5835L, 5865L, 7802L, 5776L, 6295L,
> > > 3034L, 4383L, 7842L, 6659L, 6905L, 5104L, 4162L, 7733L, 3321L,
> > > 6241L, 3212L, 6840L, 5294L, 2471L, 2748L, 10762L, 6269L, 7479L,
> > > 3416L, 6477L, 3043L, 4719L, 8764L, 2494L, 3168L, 4790L, 2983L,
> > > 5129L, 9795L, 5043L, 3913L, 7635L, 6693L, 4669L, 5968L, 7508L,
> > > 7271L, 7543L, 4730L, 4095L, 5152L, 9129L, 6395L, 7357L, 2166L,
> > > 7512L, 3544L, 9554L, 2505L, 6171L, 2568L, 6542L, 4781L, 8655L,
> > > 7721L, 4751L, 5134L, 6032L, 3013L, 6536L, 3504L, 7527L, 5106L,
> > > 3607L, 4247L, 5524L, 3404L, 5947L, 6506L, 3891L, 5256L, 4944L,
> > > 3700L, 4006L, 3392L, 5732L, 5185L, 3402L, 2451L, 2962L, 3679L,
> > > 3550L, 5429L, 5083L, 6354L, 4659L, 2155L, 1754L, 2496L, 721L,
> > > 3326L, 4917L, 5856L, 1589L, 564L, 8300L, 2987L, 5317L, 3820L,
> > > 3595L, 4483L, 3664L, 1805L, 6931L, 6225L, 3508L, 5481L, 5007L,
> > > 3623L, 891L, 4265L, 3746L, 5041L, 5823L, 5726L, 3238L, 6443L,
> > > 5053L, 4664L, 5370L, 614L, 459L, 491L, 629L, 351L, 1457L, 7812L,
> > > 4677L, 4305L, 4603L), Verses = c(20, 24, 31, 38, 22, 6, 22, 38,
> > > 6, 22, 36, 23, 42, 30, 36, 39, 55, 25, 24, 22, 26, 31, 32, 30,
> > > 25, 35, 34, 18, 11, 25, 54, 25, 8, 22, 26, 6, 30, 13, 25, 22,
> > > 21, 34, 16, 6, 22, 32, 30, 33, 35, 32, 14, 18, 21, 9, 15, 19,
> > > 35, 14, 18, 77, 13, 27, 27, 15, 30, 18, 18, 41, 27, 30, 15, 7,
> > > 33, 21, 19, 22, 29, 37, 35, 12, 31, 15, 20, 35, 29, 26, 36, 16,
> > > 39, 25, 24, 39, 37, 20, 47, 33, 38, 27, 20, 62, 8, 27, 32, 34,
> > > 32, 46, 37, 31, 29, 19, 21, 39, 43, 36, 30, 23, 35, 18, 30, 17,
> > > 37, 30, 14, 17, 60, 38, 43, 23, 41, 16, 30, 47, 15, 19, 26, 15,
> > > 31, 54, 24, 24, 41, 36, 25, 30, 40, 37, 40, 23, 24, 35, 57, 36,
> > > 41, 13, 36, 21, 52, 17, 34, 14, 37, 26, 52, 41, 29, 28, 41, 19,
> > > 38, 26, 39, 31, 17, 25, 30, 19, 26, 33, 26, 30, 26, 25, 22, 19,
> > > 41, 48, 34, 27, 24, 20, 25, 39, 36, 46, 29, 17, 14, 18, 6, 21,
> > > 33, 39, 9, 2, 49, 19, 29, 22, 23, 24, 22, 10, 41, 37, 43, 25,
> > > 28, 19, 6, 30, 27, 26, 35, 34, 23, 41, 31, 31, 34, 4, 3, 4, 3,
> > > 2, 9, 48, 30, 26, 34)), .Names = c("Chapter", "Words", "Letters",
> > > "Verses"), row.names = c(NA, -239L), class = "data.frame")
> > >
> > >
> > > --
> > > Gregory (Greg) L. Snow Ph.D.
> > > Statistical Data Center
> > > Intermountain Healthcare
> > > [hidden email]
> > > 801.408.8111
> > >
> > > ______________________________________________
> > > [hidden email] mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: optimization challenge

Ravi Varadhan
Greg - thanks for posting this interesting problem.

Albyn - thanks for posting a solution.  Now, I have some questions: (1) is
the algorithm guaranteed to find a "best" solution? (2) can there be
multiple solutions (it seems like there can be more than 1 solution
depending on the data)?, and (3) is there a good reference for this and
similar algorithms?

Thanks & Best,
Ravi.

----------------------------------------------------------------------------
-------

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [hidden email]

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 

----------------------------------------------------------------------------
--------


-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On
Behalf Of Albyn Jones
Sent: Wednesday, January 13, 2010 1:19 PM
To: Greg Snow
Cc: [hidden email]
Subject: Re: [R] optimization challenge

The key idea is that you are building a matrix that contains the
solutions to smaller problems which are sub-problems of the big
problem.  The first row of the matrix SSQ contains the solution for no
splits, ie SSQ[1,j] is just the sum of squares about the overall mean
for reading chapters1 through j in one day.  The iteration then uses
row m-1 to construct row m, since if SSQ[m-1,j] (optimal reading of j
chapters in m-1 days) is part of the overall optimal solution, you
have already computed it, and so don't ever need to recompute it.

       TS = SSQ[m-1,j]+(SSQ1[j+1])

computes the vector of possible solutions for SSQ[m,n] (n chapters in n
days)
breaking it into two pieces: chapters 1 to j in m-1 days, and chapters j+1
to
n in 1 day.  j is a vector in the function, and min(TS) is the minimum
over choices of j, ie SSQ[m,n].

At the end, SSQ[128,239] is the optimal value for reading all 239
chapters in 128 days.  That's just the objective function, so the rest
involves constructing the list of optimal cuts, ie which chapters are
grouped together for each day's reading.  That code uses the same
idea... constructing a list of lists of cutpoints.

statisticians should study a bit of data structures and algorithms!

albyn

On Wed, Jan 13, 2010 at 10:45:11AM -0700, Greg Snow wrote:
> WOW, your results give about half the variance of my best optim run
(possibly due to my suboptimal use of optim).

>
> Can you describe a little what the algorithm is doing?
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [hidden email]
> 801.408.8111
>
>
> > -----Original Message-----
> > From: Albyn Jones [mailto:[hidden email]]
> > Sent: Tuesday, January 12, 2010 5:31 PM
> > To: Greg Snow
> > Cc: [hidden email]
> > Subject: Re: [R] optimization challenge
> >
> > Greg
> >
> > Nice problem: I wasted my whole day on it :-)
> >
> > I was explaining my plan for a solution to a colleague who is a
> > computer scientist, he pointed out that I was trying to re-invent the
> > wheel known as dynamic programming.  here is my code, apparently it is
> > called "bottom up dynamic programming".  It runs pretty quickly, and
> > returns (what I hope is :-) the optimal sum of squares and the
> > cut-points.
> >
> > function(X=bom3$Verses,days=128){
> > # find optimal BOM reading schedule for Greg Snow
> > # minimize variance of quantity to read per day over 128 days
> > #
> > N = length(X)
> > Nm1 = N-1
> > SSQ<- matrix(NA,nrow=days,ncol=N)
> > Cuts <- list()
> > #
> > #  SSQ[i,j]: the ssqs about the overall mean for the optimal partition
> > #           for i days on the chapters 1 to j
> > #
> > M = sum(X)/days
> > CS = cumsum(X)
> > SSQ[1,]= (CS-M)^2
> > Cuts[[1]]= as.list(1:N)
> > #
> > for(m in 2:days){
> > Cuts[[m]]=list()
> > #for(i in 1:(m-1)) Cuts[[m]][[i]] = Cuts[[m-1]][[i]]
> > for(n in m:N){
> >            CS = cumsum(X[n:1])[n:1]
> >              SSQ1 = (CS-M)^2
> >      j = (m-1):(n-1)
> >              TS = SSQ[m-1,j]+(SSQ1[j+1])
> >  SSQ[m,n] = min(TS)
> >                   k = min(which((min(TS)== TS)))+m-1
> >                   Cuts[[m]][[n]] = c(Cuts[[m-1]][[k-1]],n)
> > }
> > }
> > list(SSQ=SSQ[days,N],Cuts=Cuts[[days]][[N]])
> > }
> >
> > $SSQ
> > [1] 11241.05
> >
> > $Cuts
> >   [1]   2   4   7   9  11  13  15  16  17  19  21  23  25  27  30  31
> > 34  37
> >  [19]  39  41  44  46  48  50  53  56  59  60  62  64  66  68  70  73
> > 75  77
> >  [37]  78  80  82  84  86  88  89  91  92  94  95  96  97  99 100 103
> > 105 106
> >  [55] 108 110 112 113 115 117 119 121 124 125 126 127 129 131 132 135
> > 137 138
> >  [73] 140 141 142 144 145 146 148 150 151 152 154 156 157 160 162 163
> > 164 166
> >  [91] 167 169 171 173 175 177 179 181 183 185 186 188 190 192 193 194
> > 196 199
> > [109] 201 204 205 207 209 211 213 214 215 217 220 222 223 225 226 228
> > 234 236
> > [127] 238 239
> >
> >
> >
> >
> > On Tue, Jan 12, 2010 at 11:33:36AM -0700, Greg Snow wrote:
> > > I have a challenge that I want to share with the group.
> > >
> > > This is not homework (but I may assign it as such if I teach the
> > appropriate class again) and I have found one solution, so don't need
> > anything urgent.  This is more for fun to see if others can find a
> > better solution than I did.
> > >
> > > The challenge:
> > >
> > > I want to read a book in a given number of days.  I want to read an
> > integer number of chapters each day (there are more chapters than
> > days), no stopping part way through a chapter, and at least 1 chapter
> > each day.  The chapters are very non uniform in length (some very
> > short, a few very long, many in between) so I would like to come up
> > with a reading schedule that minimizes the variance of the length of
> > the days readings (read multiple short chapters on the same day, long
> > chapters are the only one read that day).  I also want to read through
> > the book in order (no skipping ahead to combine short chapters that are
> > not naturally next to each other.
> > >
> > > My thought was that the optim function with method="SANN" would be an
> > appropriate approach, but my first couple of tries did not give very
> > good results.  I have since come up with an optim with SANN solution
> > that gives what I consider good results (but I accept that better is
> > possible).
> > >
> > > Below is a data frame with the lengths of the chapters for the book
> > that originally sparked the challenge for me (but the general idea
> > should work for any book).  Each row represents a chapter (in order)
> > with 3 different measures of the length of the chapter.
> > >
> > > For this challenge I want to read the book in 128 days (there are 239
> > chapters).
> > >
> > > I will post my solutions in a few days, but I want to wait so that my
> > direction does not influence people from trying other approaches (if
> > there is something better than optim, that is fine).
> > >
> > > Good luck for anyone interested in the challenge,
> > >
> > > The data frame:
> > >
> > > bom3 <- structure(list(Chapter = structure(1:239, .Label = c("1 Nephi
> > 1",
> > > "1 Nephi 2", "1 Nephi 3", "1 Nephi 4", "1 Nephi 5", "1 Nephi 6",
> > > "1 Nephi 7", "1 Nephi 8", "1 Nephi 9", "1 Nephi 10", "1 Nephi 11",
> > > "1 Nephi 12", "1 Nephi 13", "1 Nephi 14", "1 Nephi 15", "1 Nephi 16",
> > > "1 Nephi 17", "1 Nephi 18", "1 Nephi 19", "1 Nephi 20", "1 Nephi 21",
> > > "1 Nephi 22", "2 Nephi 1", "2 Nephi 2", "2 Nephi 3", "2 Nephi 4",
> > > "2 Nephi 5", "2 Nephi 6", "2 Nephi 7", "2 Nephi 8", "2 Nephi 9",
> > > "2 Nephi 10", "2 Nephi 11", "2 Nephi 12", "2 Nephi 13", "2 Nephi 14",
> > > "2 Nephi 15", "2 Nephi 16", "2 Nephi 17", "2 Nephi 18", "2 Nephi 19",
> > > "2 Nephi 20", "2 Nephi 21", "2 Nephi 22", "2 Nephi 23", "2 Nephi 24",
> > > "2 Nephi 25", "2 Nephi 26", "2 Nephi 27", "2 Nephi 28", "2 Nephi 29",
> > > "2 Nephi 30", "2 Nephi 31", "2 Nephi 32", "2 Nephi 33", "Jacob 1",
> > > "Jacob 2", "Jacob 3", "Jacob 4", "Jacob 5", "Jacob 6", "Jacob 7",
> > > "Enos 1", "Jarom 1", "Omni 1", "Words of Mormon 1", "Mosiah 1",
> > > "Mosiah 2", "Mosiah 3", "Mosiah 4", "Mosiah 5", "Mosiah 6", "Mosiah
> > 7",
> > > "Mosiah 8", "Mosiah 9", "Mosiah 10", "Mosiah 11", "Mosiah 12",
> > > "Mosiah 13", "Mosiah 14", "Mosiah 15", "Mosiah 16", "Mosiah 17",
> > > "Mosiah 18", "Mosiah 19", "Mosiah 20", "Mosiah 21", "Mosiah 22",
> > > "Mosiah 23", "Mosiah 24", "Mosiah 25", "Mosiah 26", "Mosiah 27",
> > > "Mosiah 28", "Mosiah 29", "Alma 1", "Alma 2", "Alma 3", "Alma 4",
> > > "Alma 5", "Alma 6", "Alma 7", "Alma 8", "Alma 9", "Alma 10",
> > > "Alma 11", "Alma 12", "Alma 13", "Alma 14", "Alma 15", "Alma 16",
> > > "Alma 17", "Alma 18", "Alma 19", "Alma 20", "Alma 21", "Alma 22",
> > > "Alma 23", "Alma 24", "Alma 25", "Alma 26", "Alma 27", "Alma 28",
> > > "Alma 29", "Alma 30", "Alma 31", "Alma 32", "Alma 33", "Alma 34",
> > > "Alma 35", "Alma 36", "Alma 37", "Alma 38", "Alma 39", "Alma 40",
> > > "Alma 41", "Alma 42", "Alma 43", "Alma 44", "Alma 45", "Alma 46",
> > > "Alma 47", "Alma 48", "Alma 49", "Alma 50", "Alma 51", "Alma 52",
> > > "Alma 53", "Alma 54", "Alma 55", "Alma 56", "Alma 57", "Alma 58",
> > > "Alma 59", "Alma 60", "Alma 61", "Alma 62", "Alma 63", "Helaman 1",
> > > "Helaman 2", "Helaman 3", "Helaman 4", "Helaman 5", "Helaman 6",
> > > "Helaman 7", "Helaman 8", "Helaman 9", "Helaman 10", "Helaman 11",
> > > "Helaman 12", "Helaman 13", "Helaman 14", "Helaman 15", "Helaman 16",
> > > "3 Nephi 1", "3 Nephi 2", "3 Nephi 3", "3 Nephi 4", "3 Nephi 5",
> > > "3 Nephi 6", "3 Nephi 7", "3 Nephi 8", "3 Nephi 9", "3 Nephi 10",
> > > "3 Nephi 11", "3 Nephi 12", "3 Nephi 13", "3 Nephi 14", "3 Nephi 15",
> > > "3 Nephi 16", "3 Nephi 17", "3 Nephi 18", "3 Nephi 19", "3 Nephi 20",
> > > "3 Nephi 21", "3 Nephi 22", "3 Nephi 23", "3 Nephi 24", "3 Nephi 25",
> > > "3 Nephi 26", "3 Nephi 27", "3 Nephi 28", "3 Nephi 29", "3 Nephi 30",
> > > "4 Nephi 1", "Mormon 1", "Mormon 2", "Mormon 3", "Mormon 4",
> > > "Mormon 5", "Mormon 6", "Mormon 7", "Mormon 8", "Mormon 9", "Ether
> > 1",
> > > "Ether 2", "Ether 3", "Ether 4", "Ether 5", "Ether 6", "Ether 7",
> > > "Ether 8", "Ether 9", "Ether 10", "Ether 11", "Ether 12", "Ether 13",
> > > "Ether 14", "Ether 15", "Moroni 1", "Moroni 2", "Moroni 3", "Moroni
> > 4",
> > > "Moroni 5", "Moroni 6", "Moroni 7", "Moroni 8", "Moroni 9", "Moroni
> > 10"
> > > ), class = "factor"), Words = c(908L, 879L, 1067L, 1262L, 761L,
> > > 202L, 992L, 1221L, 259L, 924L, 1315L, 860L, 1899L, 1284L, 1488L,
> > > 1618L, 2523L, 1217L, 1292L, 698L, 945L, 1506L, 1543L, 1460L,
> > > 1170L, 1300L, 1169L, 895L, 405L, 812L, 2388L, 966L, 338L, 647L,
> > > 587L, 203L, 857L, 370L, 687L, 570L, 587L, 928L, 520L, 134L, 587L,
> > > 891L, 1699L, 1483L, 1461L, 1240L, 804L, 708L, 988L, 426L, 647L,
> > > 719L, 1365L, 619L, 929L, 3758L, 511L, 1242L, 1160L, 734L, 1398L,
> > > 857L, 966L, 2112L, 1117L, 1605L, 740L, 309L, 1555L, 938L, 864L,
> > > 957L, 1271L, 1291L, 1073L, 391L, 1056L, 560L, 650L, 1382L, 978L,
> > > 963L, 1328L, 620L, 1186L, 954L, 837L, 1200L, 1601L, 812L, 1879L,
> > > 1496L, 1433L, 1016L, 1035L, 2787L, 390L, 1443L, 1231L, 1511L,
> > > 1442L, 1466L, 1858L, 1347L, 1526L, 711L, 1010L, 1848L, 1623L,
> > > 1701L, 1279L, 955L, 1871L, 764L, 1509L, 752L, 1665L, 1260L, 575L,
> > > 708L, 2666L, 1478L, 1837L, 855L, 1576L, 699L, 1229L, 2027L, 651L,
> > > 793L, 1153L, 701L, 1229L, 2221L, 1243L, 911L, 1809L, 1572L, 1073L,
> > > 1345L, 1734L, 1682L, 1757L, 1085L, 988L, 1218L, 2177L, 1517L,
> > > 1748L, 489L, 1777L, 854L, 2220L, 593L, 1418L, 599L, 1527L, 1076L,
> > > 2084L, 1797L, 1142L, 1210L, 1480L, 720L, 1561L, 873L, 1855L,
> > > 1241L, 824L, 1025L, 1340L, 769L, 1353L, 1518L, 931L, 1200L, 1132L,
> > > 871L, 965L, 807L, 1434L, 1294L, 834L, 628L, 731L, 913L, 879L,
> > > 1344L, 1233L, 1538L, 1155L, 507L, 415L, 620L, 183L, 793L, 1269L,
> > > 1457L, 394L, 130L, 1949L, 706L, 1262L, 939L, 822L, 1067L, 914L,
> > > 454L, 1710L, 1510L, 901L, 1370L, 1253L, 892L, 226L, 1048L, 899L,
> > > 1231L, 1424L, 1415L, 762L, 1536L, 1219L, 1132L, 1314L, 147L,
> > > 119L, 120L, 153L, 91L, 335L, 1929L, 1064L, 1012L, 1149L), Letters =
> > c(3746L,
> > > 3640L, 4295L, 4968L, 3202L, 777L, 3941L, 4769L, 1075L, 3829L,
> > > 5159L, 3527L, 7874L, 5236L, 6149L, 6555L, 10180L, 4960L, 5399L,
> > > 2767L, 3758L, 6275L, 6391L, 6151L, 4685L, 5267L, 4767L, 3780L,
> > > 1574L, 3261L, 9985L, 4106L, 1332L, 2557L, 2454L, 810L, 3540L,
> > > 1406L, 2668L, 2304L, 2474L, 3715L, 2090L, 533L, 2442L, 3587L,
> > > 7294L, 6363L, 5809L, 4997L, 3197L, 2941L, 4065L, 1743L, 2539L,
> > > 3149L, 5815L, 2765L, 4027L, 15366L, 2091L, 4988L, 4730L, 3149L,
> > > 5906L, 3704L, 4188L, 8675L, 4771L, 6586L, 2965L, 1313L, 6326L,
> > > 3948L, 3489L, 3942L, 5205L, 5309L, 4300L, 1574L, 4559L, 2393L,
> > > 2655L, 5755L, 4070L, 4027L, 5649L, 2651L, 4978L, 3970L, 3610L,
> > > 5112L, 6733L, 3513L, 7990L, 6356L, 6179L, 4340L, 4432L, 11260L,
> > > 1657L, 5913L, 5033L, 6274L, 5835L, 5865L, 7802L, 5776L, 6295L,
> > > 3034L, 4383L, 7842L, 6659L, 6905L, 5104L, 4162L, 7733L, 3321L,
> > > 6241L, 3212L, 6840L, 5294L, 2471L, 2748L, 10762L, 6269L, 7479L,
> > > 3416L, 6477L, 3043L, 4719L, 8764L, 2494L, 3168L, 4790L, 2983L,
> > > 5129L, 9795L, 5043L, 3913L, 7635L, 6693L, 4669L, 5968L, 7508L,
> > > 7271L, 7543L, 4730L, 4095L, 5152L, 9129L, 6395L, 7357L, 2166L,
> > > 7512L, 3544L, 9554L, 2505L, 6171L, 2568L, 6542L, 4781L, 8655L,
> > > 7721L, 4751L, 5134L, 6032L, 3013L, 6536L, 3504L, 7527L, 5106L,
> > > 3607L, 4247L, 5524L, 3404L, 5947L, 6506L, 3891L, 5256L, 4944L,
> > > 3700L, 4006L, 3392L, 5732L, 5185L, 3402L, 2451L, 2962L, 3679L,
> > > 3550L, 5429L, 5083L, 6354L, 4659L, 2155L, 1754L, 2496L, 721L,
> > > 3326L, 4917L, 5856L, 1589L, 564L, 8300L, 2987L, 5317L, 3820L,
> > > 3595L, 4483L, 3664L, 1805L, 6931L, 6225L, 3508L, 5481L, 5007L,
> > > 3623L, 891L, 4265L, 3746L, 5041L, 5823L, 5726L, 3238L, 6443L,
> > > 5053L, 4664L, 5370L, 614L, 459L, 491L, 629L, 351L, 1457L, 7812L,
> > > 4677L, 4305L, 4603L), Verses = c(20, 24, 31, 38, 22, 6, 22, 38,
> > > 6, 22, 36, 23, 42, 30, 36, 39, 55, 25, 24, 22, 26, 31, 32, 30,
> > > 25, 35, 34, 18, 11, 25, 54, 25, 8, 22, 26, 6, 30, 13, 25, 22,
> > > 21, 34, 16, 6, 22, 32, 30, 33, 35, 32, 14, 18, 21, 9, 15, 19,
> > > 35, 14, 18, 77, 13, 27, 27, 15, 30, 18, 18, 41, 27, 30, 15, 7,
> > > 33, 21, 19, 22, 29, 37, 35, 12, 31, 15, 20, 35, 29, 26, 36, 16,
> > > 39, 25, 24, 39, 37, 20, 47, 33, 38, 27, 20, 62, 8, 27, 32, 34,
> > > 32, 46, 37, 31, 29, 19, 21, 39, 43, 36, 30, 23, 35, 18, 30, 17,
> > > 37, 30, 14, 17, 60, 38, 43, 23, 41, 16, 30, 47, 15, 19, 26, 15,
> > > 31, 54, 24, 24, 41, 36, 25, 30, 40, 37, 40, 23, 24, 35, 57, 36,
> > > 41, 13, 36, 21, 52, 17, 34, 14, 37, 26, 52, 41, 29, 28, 41, 19,
> > > 38, 26, 39, 31, 17, 25, 30, 19, 26, 33, 26, 30, 26, 25, 22, 19,
> > > 41, 48, 34, 27, 24, 20, 25, 39, 36, 46, 29, 17, 14, 18, 6, 21,
> > > 33, 39, 9, 2, 49, 19, 29, 22, 23, 24, 22, 10, 41, 37, 43, 25,
> > > 28, 19, 6, 30, 27, 26, 35, 34, 23, 41, 31, 31, 34, 4, 3, 4, 3,
> > > 2, 9, 48, 30, 26, 34)), .Names = c("Chapter", "Words", "Letters",
> > > "Verses"), row.names = c(NA, -239L), class = "data.frame")
> > >
> > >
> > > --
> > > Gregory (Greg) L. Snow Ph.D.
> > > Statistical Data Center
> > > Intermountain Healthcare
> > > [hidden email]
> > > 801.408.8111
> > >
> > > ______________________________________________
> > > [hidden email] mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: optimization challenge

Aaron Mackey
FYI, in bioinformatics, we use dynamic programming algorithms in similar
ways to solve similar problems of finding guaranteed-optimal partitions in
streams of data (usually DNA or protein sequence, but sometimes numerical
data from chip-arrays).  These "path optimization" algorithms are often
called Viterbi algorithms, a web search for which should provide multiple
references.

The solutions are not necessarily unique (there may be multiple
paths/partitions with identical integer maxima in some systems) and there is
much research on whether the optimal solution is actually the one you want
to work with (for example, there may be a fair amount of probability mass
within an area/ensemble of suboptimal solutions that overall have greater
posterior probabilities than does the optimal solution "singleton").  See
Chip Lawrence's PNAS paper for more erudite discussion, and references
therein: www.pnas.org/content/105/9/3209.abstract

-Aaron

P.S. Good to see you here Albyn -- I enjoyed your stat. methods course at
Reed back in 1993, which started me down a somewhat windy road to
statistical genomics!

--
Aaron J. Mackey, PhD
Assistant Professor
Center for Public Health Genomics
University of Virginia
[hidden email]


On Wed, Jan 13, 2010 at 5:23 PM, Ravi Varadhan <[hidden email]> wrote:

> Greg - thanks for posting this interesting problem.
>
> Albyn - thanks for posting a solution.  Now, I have some questions: (1) is
> the algorithm guaranteed to find a "best" solution? (2) can there be
> multiple solutions (it seems like there can be more than 1 solution
> depending on the data)?, and (3) is there a good reference for this and
> similar algorithms?
>
> Thanks & Best,
> Ravi.
>
>
> ----------------------------------------------------------------------------
> -------
>
> Ravi Varadhan, Ph.D.
>
> Assistant Professor, The Center on Aging and Health
>
> Division of Geriatric Medicine and Gerontology
>
> Johns Hopkins University
>
> Ph: (410) 502-2619
>
> Fax: (410) 614-9625
>
> Email: [hidden email]
>
> Webpage:
>
> http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
> tml<http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h%0Atml>
>
>
>
>
> ----------------------------------------------------------------------------
> --------
>
>
> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]]
> On
> Behalf Of Albyn Jones
> Sent: Wednesday, January 13, 2010 1:19 PM
> To: Greg Snow
> Cc: [hidden email]
> Subject: Re: [R] optimization challenge
>
> The key idea is that you are building a matrix that contains the
> solutions to smaller problems which are sub-problems of the big
> problem.  The first row of the matrix SSQ contains the solution for no
> splits, ie SSQ[1,j] is just the sum of squares about the overall mean
> for reading chapters1 through j in one day.  The iteration then uses
> row m-1 to construct row m, since if SSQ[m-1,j] (optimal reading of j
> chapters in m-1 days) is part of the overall optimal solution, you
> have already computed it, and so don't ever need to recompute it.
>
>       TS = SSQ[m-1,j]+(SSQ1[j+1])
>
> computes the vector of possible solutions for SSQ[m,n] (n chapters in n
> days)
> breaking it into two pieces: chapters 1 to j in m-1 days, and chapters j+1
> to
> n in 1 day.  j is a vector in the function, and min(TS) is the minimum
> over choices of j, ie SSQ[m,n].
>
> At the end, SSQ[128,239] is the optimal value for reading all 239
> chapters in 128 days.  That's just the objective function, so the rest
> involves constructing the list of optimal cuts, ie which chapters are
> grouped together for each day's reading.  That code uses the same
> idea... constructing a list of lists of cutpoints.
>
> statisticians should study a bit of data structures and algorithms!
>
> albyn
>
> On Wed, Jan 13, 2010 at 10:45:11AM -0700, Greg Snow wrote:
> > WOW, your results give about half the variance of my best optim run
> (possibly due to my suboptimal use of optim).
> >
> > Can you describe a little what the algorithm is doing?
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > [hidden email]
> > 801.408.8111
> >
> >
> > > -----Original Message-----
> > > From: Albyn Jones [mailto:[hidden email]]
> > > Sent: Tuesday, January 12, 2010 5:31 PM
> > > To: Greg Snow
> > > Cc: [hidden email]
> > > Subject: Re: [R] optimization challenge
> > >
> > > Greg
> > >
> > > Nice problem: I wasted my whole day on it :-)
> > >
> > > I was explaining my plan for a solution to a colleague who is a
> > > computer scientist, he pointed out that I was trying to re-invent the
> > > wheel known as dynamic programming.  here is my code, apparently it is
> > > called "bottom up dynamic programming".  It runs pretty quickly, and
> > > returns (what I hope is :-) the optimal sum of squares and the
> > > cut-points.
> > >
> > > function(X=bom3$Verses,days=128){
> > > # find optimal BOM reading schedule for Greg Snow
> > > # minimize variance of quantity to read per day over 128 days
> > > #
> > > N = length(X)
> > > Nm1 = N-1
> > > SSQ<- matrix(NA,nrow=days,ncol=N)
> > > Cuts <- list()
> > > #
> > > #  SSQ[i,j]: the ssqs about the overall mean for the optimal partition
> > > #           for i days on the chapters 1 to j
> > > #
> > > M = sum(X)/days
> > > CS = cumsum(X)
> > > SSQ[1,]= (CS-M)^2
> > > Cuts[[1]]= as.list(1:N)
> > > #
> > > for(m in 2:days){
> > > Cuts[[m]]=list()
> > > #for(i in 1:(m-1)) Cuts[[m]][[i]] = Cuts[[m-1]][[i]]
> > > for(n in m:N){
> > >                       CS = cumsum(X[n:1])[n:1]
> > >                       SSQ1 = (CS-M)^2
> > >               j = (m-1):(n-1)
> > >                       TS = SSQ[m-1,j]+(SSQ1[j+1])
> > >               SSQ[m,n] = min(TS)
> > >                   k = min(which((min(TS)== TS)))+m-1
> > >                   Cuts[[m]][[n]] = c(Cuts[[m-1]][[k-1]],n)
> > > }
> > > }
> > > list(SSQ=SSQ[days,N],Cuts=Cuts[[days]][[N]])
> > > }
> > >
> > > $SSQ
> > > [1] 11241.05
> > >
> > > $Cuts
> > >   [1]   2   4   7   9  11  13  15  16  17  19  21  23  25  27  30  31
> > > 34  37
> > >  [19]  39  41  44  46  48  50  53  56  59  60  62  64  66  68  70  73
> > > 75  77
> > >  [37]  78  80  82  84  86  88  89  91  92  94  95  96  97  99 100 103
> > > 105 106
> > >  [55] 108 110 112 113 115 117 119 121 124 125 126 127 129 131 132 135
> > > 137 138
> > >  [73] 140 141 142 144 145 146 148 150 151 152 154 156 157 160 162 163
> > > 164 166
> > >  [91] 167 169 171 173 175 177 179 181 183 185 186 188 190 192 193 194
> > > 196 199
> > > [109] 201 204 205 207 209 211 213 214 215 217 220 222 223 225 226 228
> > > 234 236
> > > [127] 238 239
> > >
> > >
> > >
> > >
> > > On Tue, Jan 12, 2010 at 11:33:36AM -0700, Greg Snow wrote:
> > > > I have a challenge that I want to share with the group.
> > > >
> > > > This is not homework (but I may assign it as such if I teach the
> > > appropriate class again) and I have found one solution, so don't need
> > > anything urgent.  This is more for fun to see if others can find a
> > > better solution than I did.
> > > >
> > > > The challenge:
> > > >
> > > > I want to read a book in a given number of days.  I want to read an
> > > integer number of chapters each day (there are more chapters than
> > > days), no stopping part way through a chapter, and at least 1 chapter
> > > each day.  The chapters are very non uniform in length (some very
> > > short, a few very long, many in between) so I would like to come up
> > > with a reading schedule that minimizes the variance of the length of
> > > the days readings (read multiple short chapters on the same day, long
> > > chapters are the only one read that day).  I also want to read through
> > > the book in order (no skipping ahead to combine short chapters that are
> > > not naturally next to each other.
> > > >
> > > > My thought was that the optim function with method="SANN" would be an
> > > appropriate approach, but my first couple of tries did not give very
> > > good results.  I have since come up with an optim with SANN solution
> > > that gives what I consider good results (but I accept that better is
> > > possible).
> > > >
> > > > Below is a data frame with the lengths of the chapters for the book
> > > that originally sparked the challenge for me (but the general idea
> > > should work for any book).  Each row represents a chapter (in order)
> > > with 3 different measures of the length of the chapter.
> > > >
> > > > For this challenge I want to read the book in 128 days (there are 239
> > > chapters).
> > > >
> > > > I will post my solutions in a few days, but I want to wait so that my
> > > direction does not influence people from trying other approaches (if
> > > there is something better than optim, that is fine).
> > > >
> > > > Good luck for anyone interested in the challenge,
> > > >
> > > > The data frame:
> > > >
> > > > bom3 <- structure(list(Chapter = structure(1:239, .Label = c("1 Nephi
> > > 1",
> > > > "1 Nephi 2", "1 Nephi 3", "1 Nephi 4", "1 Nephi 5", "1 Nephi 6",
> > > > "1 Nephi 7", "1 Nephi 8", "1 Nephi 9", "1 Nephi 10", "1 Nephi 11",
> > > > "1 Nephi 12", "1 Nephi 13", "1 Nephi 14", "1 Nephi 15", "1 Nephi 16",
> > > > "1 Nephi 17", "1 Nephi 18", "1 Nephi 19", "1 Nephi 20", "1 Nephi 21",
> > > > "1 Nephi 22", "2 Nephi 1", "2 Nephi 2", "2 Nephi 3", "2 Nephi 4",
> > > > "2 Nephi 5", "2 Nephi 6", "2 Nephi 7", "2 Nephi 8", "2 Nephi 9",
> > > > "2 Nephi 10", "2 Nephi 11", "2 Nephi 12", "2 Nephi 13", "2 Nephi 14",
> > > > "2 Nephi 15", "2 Nephi 16", "2 Nephi 17", "2 Nephi 18", "2 Nephi 19",
> > > > "2 Nephi 20", "2 Nephi 21", "2 Nephi 22", "2 Nephi 23", "2 Nephi 24",
> > > > "2 Nephi 25", "2 Nephi 26", "2 Nephi 27", "2 Nephi 28", "2 Nephi 29",
> > > > "2 Nephi 30", "2 Nephi 31", "2 Nephi 32", "2 Nephi 33", "Jacob 1",
> > > > "Jacob 2", "Jacob 3", "Jacob 4", "Jacob 5", "Jacob 6", "Jacob 7",
> > > > "Enos 1", "Jarom 1", "Omni 1", "Words of Mormon 1", "Mosiah 1",
> > > > "Mosiah 2", "Mosiah 3", "Mosiah 4", "Mosiah 5", "Mosiah 6", "Mosiah
> > > 7",
> > > > "Mosiah 8", "Mosiah 9", "Mosiah 10", "Mosiah 11", "Mosiah 12",
> > > > "Mosiah 13", "Mosiah 14", "Mosiah 15", "Mosiah 16", "Mosiah 17",
> > > > "Mosiah 18", "Mosiah 19", "Mosiah 20", "Mosiah 21", "Mosiah 22",
> > > > "Mosiah 23", "Mosiah 24", "Mosiah 25", "Mosiah 26", "Mosiah 27",
> > > > "Mosiah 28", "Mosiah 29", "Alma 1", "Alma 2", "Alma 3", "Alma 4",
> > > > "Alma 5", "Alma 6", "Alma 7", "Alma 8", "Alma 9", "Alma 10",
> > > > "Alma 11", "Alma 12", "Alma 13", "Alma 14", "Alma 15", "Alma 16",
> > > > "Alma 17", "Alma 18", "Alma 19", "Alma 20", "Alma 21", "Alma 22",
> > > > "Alma 23", "Alma 24", "Alma 25", "Alma 26", "Alma 27", "Alma 28",
> > > > "Alma 29", "Alma 30", "Alma 31", "Alma 32", "Alma 33", "Alma 34",
> > > > "Alma 35", "Alma 36", "Alma 37", "Alma 38", "Alma 39", "Alma 40",
> > > > "Alma 41", "Alma 42", "Alma 43", "Alma 44", "Alma 45", "Alma 46",
> > > > "Alma 47", "Alma 48", "Alma 49", "Alma 50", "Alma 51", "Alma 52",
> > > > "Alma 53", "Alma 54", "Alma 55", "Alma 56", "Alma 57", "Alma 58",
> > > > "Alma 59", "Alma 60", "Alma 61", "Alma 62", "Alma 63", "Helaman 1",
> > > > "Helaman 2", "Helaman 3", "Helaman 4", "Helaman 5", "Helaman 6",
> > > > "Helaman 7", "Helaman 8", "Helaman 9", "Helaman 10", "Helaman 11",
> > > > "Helaman 12", "Helaman 13", "Helaman 14", "Helaman 15", "Helaman 16",
> > > > "3 Nephi 1", "3 Nephi 2", "3 Nephi 3", "3 Nephi 4", "3 Nephi 5",
> > > > "3 Nephi 6", "3 Nephi 7", "3 Nephi 8", "3 Nephi 9", "3 Nephi 10",
> > > > "3 Nephi 11", "3 Nephi 12", "3 Nephi 13", "3 Nephi 14", "3 Nephi 15",
> > > > "3 Nephi 16", "3 Nephi 17", "3 Nephi 18", "3 Nephi 19", "3 Nephi 20",
> > > > "3 Nephi 21", "3 Nephi 22", "3 Nephi 23", "3 Nephi 24", "3 Nephi 25",
> > > > "3 Nephi 26", "3 Nephi 27", "3 Nephi 28", "3 Nephi 29", "3 Nephi 30",
> > > > "4 Nephi 1", "Mormon 1", "Mormon 2", "Mormon 3", "Mormon 4",
> > > > "Mormon 5", "Mormon 6", "Mormon 7", "Mormon 8", "Mormon 9", "Ether
> > > 1",
> > > > "Ether 2", "Ether 3", "Ether 4", "Ether 5", "Ether 6", "Ether 7",
> > > > "Ether 8", "Ether 9", "Ether 10", "Ether 11", "Ether 12", "Ether 13",
> > > > "Ether 14", "Ether 15", "Moroni 1", "Moroni 2", "Moroni 3", "Moroni
> > > 4",
> > > > "Moroni 5", "Moroni 6", "Moroni 7", "Moroni 8", "Moroni 9", "Moroni
> > > 10"
> > > > ), class = "factor"), Words = c(908L, 879L, 1067L, 1262L, 761L,
> > > > 202L, 992L, 1221L, 259L, 924L, 1315L, 860L, 1899L, 1284L, 1488L,
> > > > 1618L, 2523L, 1217L, 1292L, 698L, 945L, 1506L, 1543L, 1460L,
> > > > 1170L, 1300L, 1169L, 895L, 405L, 812L, 2388L, 966L, 338L, 647L,
> > > > 587L, 203L, 857L, 370L, 687L, 570L, 587L, 928L, 520L, 134L, 587L,
> > > > 891L, 1699L, 1483L, 1461L, 1240L, 804L, 708L, 988L, 426L, 647L,
> > > > 719L, 1365L, 619L, 929L, 3758L, 511L, 1242L, 1160L, 734L, 1398L,
> > > > 857L, 966L, 2112L, 1117L, 1605L, 740L, 309L, 1555L, 938L, 864L,
> > > > 957L, 1271L, 1291L, 1073L, 391L, 1056L, 560L, 650L, 1382L, 978L,
> > > > 963L, 1328L, 620L, 1186L, 954L, 837L, 1200L, 1601L, 812L, 1879L,
> > > > 1496L, 1433L, 1016L, 1035L, 2787L, 390L, 1443L, 1231L, 1511L,
> > > > 1442L, 1466L, 1858L, 1347L, 1526L, 711L, 1010L, 1848L, 1623L,
> > > > 1701L, 1279L, 955L, 1871L, 764L, 1509L, 752L, 1665L, 1260L, 575L,
> > > > 708L, 2666L, 1478L, 1837L, 855L, 1576L, 699L, 1229L, 2027L, 651L,
> > > > 793L, 1153L, 701L, 1229L, 2221L, 1243L, 911L, 1809L, 1572L, 1073L,
> > > > 1345L, 1734L, 1682L, 1757L, 1085L, 988L, 1218L, 2177L, 1517L,
> > > > 1748L, 489L, 1777L, 854L, 2220L, 593L, 1418L, 599L, 1527L, 1076L,
> > > > 2084L, 1797L, 1142L, 1210L, 1480L, 720L, 1561L, 873L, 1855L,
> > > > 1241L, 824L, 1025L, 1340L, 769L, 1353L, 1518L, 931L, 1200L, 1132L,
> > > > 871L, 965L, 807L, 1434L, 1294L, 834L, 628L, 731L, 913L, 879L,
> > > > 1344L, 1233L, 1538L, 1155L, 507L, 415L, 620L, 183L, 793L, 1269L,
> > > > 1457L, 394L, 130L, 1949L, 706L, 1262L, 939L, 822L, 1067L, 914L,
> > > > 454L, 1710L, 1510L, 901L, 1370L, 1253L, 892L, 226L, 1048L, 899L,
> > > > 1231L, 1424L, 1415L, 762L, 1536L, 1219L, 1132L, 1314L, 147L,
> > > > 119L, 120L, 153L, 91L, 335L, 1929L, 1064L, 1012L, 1149L), Letters =
> > > c(3746L,
> > > > 3640L, 4295L, 4968L, 3202L, 777L, 3941L, 4769L, 1075L, 3829L,
> > > > 5159L, 3527L, 7874L, 5236L, 6149L, 6555L, 10180L, 4960L, 5399L,
> > > > 2767L, 3758L, 6275L, 6391L, 6151L, 4685L, 5267L, 4767L, 3780L,
> > > > 1574L, 3261L, 9985L, 4106L, 1332L, 2557L, 2454L, 810L, 3540L,
> > > > 1406L, 2668L, 2304L, 2474L, 3715L, 2090L, 533L, 2442L, 3587L,
> > > > 7294L, 6363L, 5809L, 4997L, 3197L, 2941L, 4065L, 1743L, 2539L,
> > > > 3149L, 5815L, 2765L, 4027L, 15366L, 2091L, 4988L, 4730L, 3149L,
> > > > 5906L, 3704L, 4188L, 8675L, 4771L, 6586L, 2965L, 1313L, 6326L,
> > > > 3948L, 3489L, 3942L, 5205L, 5309L, 4300L, 1574L, 4559L, 2393L,
> > > > 2655L, 5755L, 4070L, 4027L, 5649L, 2651L, 4978L, 3970L, 3610L,
> > > > 5112L, 6733L, 3513L, 7990L, 6356L, 6179L, 4340L, 4432L, 11260L,
> > > > 1657L, 5913L, 5033L, 6274L, 5835L, 5865L, 7802L, 5776L, 6295L,
> > > > 3034L, 4383L, 7842L, 6659L, 6905L, 5104L, 4162L, 7733L, 3321L,
> > > > 6241L, 3212L, 6840L, 5294L, 2471L, 2748L, 10762L, 6269L, 7479L,
> > > > 3416L, 6477L, 3043L, 4719L, 8764L, 2494L, 3168L, 4790L, 2983L,
> > > > 5129L, 9795L, 5043L, 3913L, 7635L, 6693L, 4669L, 5968L, 7508L,
> > > > 7271L, 7543L, 4730L, 4095L, 5152L, 9129L, 6395L, 7357L, 2166L,
> > > > 7512L, 3544L, 9554L, 2505L, 6171L, 2568L, 6542L, 4781L, 8655L,
> > > > 7721L, 4751L, 5134L, 6032L, 3013L, 6536L, 3504L, 7527L, 5106L,
> > > > 3607L, 4247L, 5524L, 3404L, 5947L, 6506L, 3891L, 5256L, 4944L,
> > > > 3700L, 4006L, 3392L, 5732L, 5185L, 3402L, 2451L, 2962L, 3679L,
> > > > 3550L, 5429L, 5083L, 6354L, 4659L, 2155L, 1754L, 2496L, 721L,
> > > > 3326L, 4917L, 5856L, 1589L, 564L, 8300L, 2987L, 5317L, 3820L,
> > > > 3595L, 4483L, 3664L, 1805L, 6931L, 6225L, 3508L, 5481L, 5007L,
> > > > 3623L, 891L, 4265L, 3746L, 5041L, 5823L, 5726L, 3238L, 6443L,
> > > > 5053L, 4664L, 5370L, 614L, 459L, 491L, 629L, 351L, 1457L, 7812L,
> > > > 4677L, 4305L, 4603L), Verses = c(20, 24, 31, 38, 22, 6, 22, 38,
> > > > 6, 22, 36, 23, 42, 30, 36, 39, 55, 25, 24, 22, 26, 31, 32, 30,
> > > > 25, 35, 34, 18, 11, 25, 54, 25, 8, 22, 26, 6, 30, 13, 25, 22,
> > > > 21, 34, 16, 6, 22, 32, 30, 33, 35, 32, 14, 18, 21, 9, 15, 19,
> > > > 35, 14, 18, 77, 13, 27, 27, 15, 30, 18, 18, 41, 27, 30, 15, 7,
> > > > 33, 21, 19, 22, 29, 37, 35, 12, 31, 15, 20, 35, 29, 26, 36, 16,
> > > > 39, 25, 24, 39, 37, 20, 47, 33, 38, 27, 20, 62, 8, 27, 32, 34,
> > > > 32, 46, 37, 31, 29, 19, 21, 39, 43, 36, 30, 23, 35, 18, 30, 17,
> > > > 37, 30, 14, 17, 60, 38, 43, 23, 41, 16, 30, 47, 15, 19, 26, 15,
> > > > 31, 54, 24, 24, 41, 36, 25, 30, 40, 37, 40, 23, 24, 35, 57, 36,
> > > > 41, 13, 36, 21, 52, 17, 34, 14, 37, 26, 52, 41, 29, 28, 41, 19,
> > > > 38, 26, 39, 31, 17, 25, 30, 19, 26, 33, 26, 30, 26, 25, 22, 19,
> > > > 41, 48, 34, 27, 24, 20, 25, 39, 36, 46, 29, 17, 14, 18, 6, 21,
> > > > 33, 39, 9, 2, 49, 19, 29, 22, 23, 24, 22, 10, 41, 37, 43, 25,
> > > > 28, 19, 6, 30, 27, 26, 35, 34, 23, 41, 31, 31, 34, 4, 3, 4, 3,
> > > > 2, 9, 48, 30, 26, 34)), .Names = c("Chapter", "Words", "Letters",
> > > > "Verses"), row.names = c(NA, -239L), class = "data.frame")
> > > >
> > > >
> > > > --
> > > > Gregory (Greg) L. Snow Ph.D.
> > > > Statistical Data Center
> > > > Intermountain Healthcare
> > > > [hidden email]
> > > > 801.408.8111
> > > >
> > > > ______________________________________________
> > > > [hidden email] mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide http://www.R-project.org/posting-
> > > guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
> > > >
> >
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: optimization challenge

jones-2
Hi Aaron!  It's always nice to see a former student doing well.

Thanks for the notes and references, too!

albyn

On Wed, Jan 13, 2010 at 07:29:57PM -0500, Aaron Mackey wrote:

> FYI, in bioinformatics, we use dynamic programming algorithms in similar
> ways to solve similar problems of finding guaranteed-optimal partitions in
> streams of data (usually DNA or protein sequence, but sometimes numerical
> data from chip-arrays).  These "path optimization" algorithms are often
> called Viterbi algorithms, a web search for which should provide multiple
> references.
>
> The solutions are not necessarily unique (there may be multiple
> paths/partitions with identical integer maxima in some systems) and there is
> much research on whether the optimal solution is actually the one you want
> to work with (for example, there may be a fair amount of probability mass
> within an area/ensemble of suboptimal solutions that overall have greater
> posterior probabilities than does the optimal solution "singleton").  See
> Chip Lawrence's PNAS paper for more erudite discussion, and references
> therein: www.pnas.org/content/105/9/3209.abstract
>
> -Aaron
>
> P.S. Good to see you here Albyn -- I enjoyed your stat. methods course at
> Reed back in 1993, which started me down a somewhat windy road to
> statistical genomics!
>
> --
> Aaron J. Mackey, PhD
> Assistant Professor
> Center for Public Health Genomics
> University of Virginia
> [hidden email]
>
>
> On Wed, Jan 13, 2010 at 5:23 PM, Ravi Varadhan <[hidden email]> wrote:
>
> > Greg - thanks for posting this interesting problem.
> >
> > Albyn - thanks for posting a solution.  Now, I have some questions: (1) is
> > the algorithm guaranteed to find a "best" solution? (2) can there be
> > multiple solutions (it seems like there can be more than 1 solution
> > depending on the data)?, and (3) is there a good reference for this and
> > similar algorithms?
> >
> > Thanks & Best,
> > Ravi.
> >
> >
> > ----------------------------------------------------------------------------
> > -------
> >
> > Ravi Varadhan, Ph.D.
> >
> > Assistant Professor, The Center on Aging and Health
> >
> > Division of Geriatric Medicine and Gerontology
> >
> > Johns Hopkins University
> >
> > Ph: (410) 502-2619
> >
> > Fax: (410) 614-9625
> >
> > Email: [hidden email]
> >
> > Webpage:
> >
> > http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
> > tml<http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h%0Atml>
> >
> >
> >
> >
> > ----------------------------------------------------------------------------
> > --------
> >
> >
> > -----Original Message-----
> > From: [hidden email] [mailto:[hidden email]]
> > On
> > Behalf Of Albyn Jones
> > Sent: Wednesday, January 13, 2010 1:19 PM
> > To: Greg Snow
> > Cc: [hidden email]
> > Subject: Re: [R] optimization challenge
> >
> > The key idea is that you are building a matrix that contains the
> > solutions to smaller problems which are sub-problems of the big
> > problem.  The first row of the matrix SSQ contains the solution for no
> > splits, ie SSQ[1,j] is just the sum of squares about the overall mean
> > for reading chapters1 through j in one day.  The iteration then uses
> > row m-1 to construct row m, since if SSQ[m-1,j] (optimal reading of j
> > chapters in m-1 days) is part of the overall optimal solution, you
> > have already computed it, and so don't ever need to recompute it.
> >
> >       TS = SSQ[m-1,j]+(SSQ1[j+1])
> >
> > computes the vector of possible solutions for SSQ[m,n] (n chapters in n
> > days)
> > breaking it into two pieces: chapters 1 to j in m-1 days, and chapters j+1
> > to
> > n in 1 day.  j is a vector in the function, and min(TS) is the minimum
> > over choices of j, ie SSQ[m,n].
> >
> > At the end, SSQ[128,239] is the optimal value for reading all 239
> > chapters in 128 days.  That's just the objective function, so the rest
> > involves constructing the list of optimal cuts, ie which chapters are
> > grouped together for each day's reading.  That code uses the same
> > idea... constructing a list of lists of cutpoints.
> >
> > statisticians should study a bit of data structures and algorithms!
> >
> > albyn
> >
> > On Wed, Jan 13, 2010 at 10:45:11AM -0700, Greg Snow wrote:
> > > WOW, your results give about half the variance of my best optim run
> > (possibly due to my suboptimal use of optim).
> > >
> > > Can you describe a little what the algorithm is doing?
> > >
> > > --
> > > Gregory (Greg) L. Snow Ph.D.
> > > Statistical Data Center
> > > Intermountain Healthcare
> > > [hidden email]
> > > 801.408.8111
> > >
> > >
> > > > -----Original Message-----
> > > > From: Albyn Jones [mailto:[hidden email]]
> > > > Sent: Tuesday, January 12, 2010 5:31 PM
> > > > To: Greg Snow
> > > > Cc: [hidden email]
> > > > Subject: Re: [R] optimization challenge
> > > >
> > > > Greg
> > > >
> > > > Nice problem: I wasted my whole day on it :-)
> > > >
> > > > I was explaining my plan for a solution to a colleague who is a
> > > > computer scientist, he pointed out that I was trying to re-invent the
> > > > wheel known as dynamic programming.  here is my code, apparently it is
> > > > called "bottom up dynamic programming".  It runs pretty quickly, and
> > > > returns (what I hope is :-) the optimal sum of squares and the
> > > > cut-points.
> > > >
> > > > function(X=bom3$Verses,days=128){
> > > > # find optimal BOM reading schedule for Greg Snow
> > > > # minimize variance of quantity to read per day over 128 days
> > > > #
> > > > N = length(X)
> > > > Nm1 = N-1
> > > > SSQ<- matrix(NA,nrow=days,ncol=N)
> > > > Cuts <- list()
> > > > #
> > > > #  SSQ[i,j]: the ssqs about the overall mean for the optimal partition
> > > > #           for i days on the chapters 1 to j
> > > > #
> > > > M = sum(X)/days
> > > > CS = cumsum(X)
> > > > SSQ[1,]= (CS-M)^2
> > > > Cuts[[1]]= as.list(1:N)
> > > > #
> > > > for(m in 2:days){
> > > > Cuts[[m]]=list()
> > > > #for(i in 1:(m-1)) Cuts[[m]][[i]] = Cuts[[m-1]][[i]]
> > > > for(n in m:N){
> > > >                       CS = cumsum(X[n:1])[n:1]
> > > >                       SSQ1 = (CS-M)^2
> > > >               j = (m-1):(n-1)
> > > >                       TS = SSQ[m-1,j]+(SSQ1[j+1])
> > > >               SSQ[m,n] = min(TS)
> > > >                   k = min(which((min(TS)== TS)))+m-1
> > > >                   Cuts[[m]][[n]] = c(Cuts[[m-1]][[k-1]],n)
> > > > }
> > > > }
> > > > list(SSQ=SSQ[days,N],Cuts=Cuts[[days]][[N]])
> > > > }
> > > >
> > > > $SSQ
> > > > [1] 11241.05
> > > >
> > > > $Cuts
> > > >   [1]   2   4   7   9  11  13  15  16  17  19  21  23  25  27  30  31
> > > > 34  37
> > > >  [19]  39  41  44  46  48  50  53  56  59  60  62  64  66  68  70  73
> > > > 75  77
> > > >  [37]  78  80  82  84  86  88  89  91  92  94  95  96  97  99 100 103
> > > > 105 106
> > > >  [55] 108 110 112 113 115 117 119 121 124 125 126 127 129 131 132 135
> > > > 137 138
> > > >  [73] 140 141 142 144 145 146 148 150 151 152 154 156 157 160 162 163
> > > > 164 166
> > > >  [91] 167 169 171 173 175 177 179 181 183 185 186 188 190 192 193 194
> > > > 196 199
> > > > [109] 201 204 205 207 209 211 213 214 215 217 220 222 223 225 226 228
> > > > 234 236
> > > > [127] 238 239
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Jan 12, 2010 at 11:33:36AM -0700, Greg Snow wrote:
> > > > > I have a challenge that I want to share with the group.
> > > > >
> > > > > This is not homework (but I may assign it as such if I teach the
> > > > appropriate class again) and I have found one solution, so don't need
> > > > anything urgent.  This is more for fun to see if others can find a
> > > > better solution than I did.
> > > > >
> > > > > The challenge:
> > > > >
> > > > > I want to read a book in a given number of days.  I want to read an
> > > > integer number of chapters each day (there are more chapters than
> > > > days), no stopping part way through a chapter, and at least 1 chapter
> > > > each day.  The chapters are very non uniform in length (some very
> > > > short, a few very long, many in between) so I would like to come up
> > > > with a reading schedule that minimizes the variance of the length of
> > > > the days readings (read multiple short chapters on the same day, long
> > > > chapters are the only one read that day).  I also want to read through
> > > > the book in order (no skipping ahead to combine short chapters that are
> > > > not naturally next to each other.
> > > > >
> > > > > My thought was that the optim function with method="SANN" would be an
> > > > appropriate approach, but my first couple of tries did not give very
> > > > good results.  I have since come up with an optim with SANN solution
> > > > that gives what I consider good results (but I accept that better is
> > > > possible).
> > > > >
> > > > > Below is a data frame with the lengths of the chapters for the book
> > > > that originally sparked the challenge for me (but the general idea
> > > > should work for any book).  Each row represents a chapter (in order)
> > > > with 3 different measures of the length of the chapter.
> > > > >
> > > > > For this challenge I want to read the book in 128 days (there are 239
> > > > chapters).
> > > > >
> > > > > I will post my solutions in a few days, but I want to wait so that my
> > > > direction does not influence people from trying other approaches (if
> > > > there is something better than optim, that is fine).
> > > > >
> > > > > Good luck for anyone interested in the challenge,
> > > > >
> > > > > The data frame:
> > > > >
> > > > > bom3 <- structure(list(Chapter = structure(1:239, .Label = c("1 Nephi
> > > > 1",
> > > > > "1 Nephi 2", "1 Nephi 3", "1 Nephi 4", "1 Nephi 5", "1 Nephi 6",
> > > > > "1 Nephi 7", "1 Nephi 8", "1 Nephi 9", "1 Nephi 10", "1 Nephi 11",
> > > > > "1 Nephi 12", "1 Nephi 13", "1 Nephi 14", "1 Nephi 15", "1 Nephi 16",
> > > > > "1 Nephi 17", "1 Nephi 18", "1 Nephi 19", "1 Nephi 20", "1 Nephi 21",
> > > > > "1 Nephi 22", "2 Nephi 1", "2 Nephi 2", "2 Nephi 3", "2 Nephi 4",
> > > > > "2 Nephi 5", "2 Nephi 6", "2 Nephi 7", "2 Nephi 8", "2 Nephi 9",
> > > > > "2 Nephi 10", "2 Nephi 11", "2 Nephi 12", "2 Nephi 13", "2 Nephi 14",
> > > > > "2 Nephi 15", "2 Nephi 16", "2 Nephi 17", "2 Nephi 18", "2 Nephi 19",
> > > > > "2 Nephi 20", "2 Nephi 21", "2 Nephi 22", "2 Nephi 23", "2 Nephi 24",
> > > > > "2 Nephi 25", "2 Nephi 26", "2 Nephi 27", "2 Nephi 28", "2 Nephi 29",
> > > > > "2 Nephi 30", "2 Nephi 31", "2 Nephi 32", "2 Nephi 33", "Jacob 1",
> > > > > "Jacob 2", "Jacob 3", "Jacob 4", "Jacob 5", "Jacob 6", "Jacob 7",
> > > > > "Enos 1", "Jarom 1", "Omni 1", "Words of Mormon 1", "Mosiah 1",
> > > > > "Mosiah 2", "Mosiah 3", "Mosiah 4", "Mosiah 5", "Mosiah 6", "Mosiah
> > > > 7",
> > > > > "Mosiah 8", "Mosiah 9", "Mosiah 10", "Mosiah 11", "Mosiah 12",
> > > > > "Mosiah 13", "Mosiah 14", "Mosiah 15", "Mosiah 16", "Mosiah 17",
> > > > > "Mosiah 18", "Mosiah 19", "Mosiah 20", "Mosiah 21", "Mosiah 22",
> > > > > "Mosiah 23", "Mosiah 24", "Mosiah 25", "Mosiah 26", "Mosiah 27",
> > > > > "Mosiah 28", "Mosiah 29", "Alma 1", "Alma 2", "Alma 3", "Alma 4",
> > > > > "Alma 5", "Alma 6", "Alma 7", "Alma 8", "Alma 9", "Alma 10",
> > > > > "Alma 11", "Alma 12", "Alma 13", "Alma 14", "Alma 15", "Alma 16",
> > > > > "Alma 17", "Alma 18", "Alma 19", "Alma 20", "Alma 21", "Alma 22",
> > > > > "Alma 23", "Alma 24", "Alma 25", "Alma 26", "Alma 27", "Alma 28",
> > > > > "Alma 29", "Alma 30", "Alma 31", "Alma 32", "Alma 33", "Alma 34",
> > > > > "Alma 35", "Alma 36", "Alma 37", "Alma 38", "Alma 39", "Alma 40",
> > > > > "Alma 41", "Alma 42", "Alma 43", "Alma 44", "Alma 45", "Alma 46",
> > > > > "Alma 47", "Alma 48", "Alma 49", "Alma 50", "Alma 51", "Alma 52",
> > > > > "Alma 53", "Alma 54", "Alma 55", "Alma 56", "Alma 57", "Alma 58",
> > > > > "Alma 59", "Alma 60", "Alma 61", "Alma 62", "Alma 63", "Helaman 1",
> > > > > "Helaman 2", "Helaman 3", "Helaman 4", "Helaman 5", "Helaman 6",
> > > > > "Helaman 7", "Helaman 8", "Helaman 9", "Helaman 10", "Helaman 11",
> > > > > "Helaman 12", "Helaman 13", "Helaman 14", "Helaman 15", "Helaman 16",
> > > > > "3 Nephi 1", "3 Nephi 2", "3 Nephi 3", "3 Nephi 4", "3 Nephi 5",
> > > > > "3 Nephi 6", "3 Nephi 7", "3 Nephi 8", "3 Nephi 9", "3 Nephi 10",
> > > > > "3 Nephi 11", "3 Nephi 12", "3 Nephi 13", "3 Nephi 14", "3 Nephi 15",
> > > > > "3 Nephi 16", "3 Nephi 17", "3 Nephi 18", "3 Nephi 19", "3 Nephi 20",
> > > > > "3 Nephi 21", "3 Nephi 22", "3 Nephi 23", "3 Nephi 24", "3 Nephi 25",
> > > > > "3 Nephi 26", "3 Nephi 27", "3 Nephi 28", "3 Nephi 29", "3 Nephi 30",
> > > > > "4 Nephi 1", "Mormon 1", "Mormon 2", "Mormon 3", "Mormon 4",
> > > > > "Mormon 5", "Mormon 6", "Mormon 7", "Mormon 8", "Mormon 9", "Ether
> > > > 1",
> > > > > "Ether 2", "Ether 3", "Ether 4", "Ether 5", "Ether 6", "Ether 7",
> > > > > "Ether 8", "Ether 9", "Ether 10", "Ether 11", "Ether 12", "Ether 13",
> > > > > "Ether 14", "Ether 15", "Moroni 1", "Moroni 2", "Moroni 3", "Moroni
> > > > 4",
> > > > > "Moroni 5", "Moroni 6", "Moroni 7", "Moroni 8", "Moroni 9", "Moroni
> > > > 10"
> > > > > ), class = "factor"), Words = c(908L, 879L, 1067L, 1262L, 761L,
> > > > > 202L, 992L, 1221L, 259L, 924L, 1315L, 860L, 1899L, 1284L, 1488L,
> > > > > 1618L, 2523L, 1217L, 1292L, 698L, 945L, 1506L, 1543L, 1460L,
> > > > > 1170L, 1300L, 1169L, 895L, 405L, 812L, 2388L, 966L, 338L, 647L,
> > > > > 587L, 203L, 857L, 370L, 687L, 570L, 587L, 928L, 520L, 134L, 587L,
> > > > > 891L, 1699L, 1483L, 1461L, 1240L, 804L, 708L, 988L, 426L, 647L,
> > > > > 719L, 1365L, 619L, 929L, 3758L, 511L, 1242L, 1160L, 734L, 1398L,
> > > > > 857L, 966L, 2112L, 1117L, 1605L, 740L, 309L, 1555L, 938L, 864L,
> > > > > 957L, 1271L, 1291L, 1073L, 391L, 1056L, 560L, 650L, 1382L, 978L,
> > > > > 963L, 1328L, 620L, 1186L, 954L, 837L, 1200L, 1601L, 812L, 1879L,
> > > > > 1496L, 1433L, 1016L, 1035L, 2787L, 390L, 1443L, 1231L, 1511L,
> > > > > 1442L, 1466L, 1858L, 1347L, 1526L, 711L, 1010L, 1848L, 1623L,
> > > > > 1701L, 1279L, 955L, 1871L, 764L, 1509L, 752L, 1665L, 1260L, 575L,
> > > > > 708L, 2666L, 1478L, 1837L, 855L, 1576L, 699L, 1229L, 2027L, 651L,
> > > > > 793L, 1153L, 701L, 1229L, 2221L, 1243L, 911L, 1809L, 1572L, 1073L,
> > > > > 1345L, 1734L, 1682L, 1757L, 1085L, 988L, 1218L, 2177L, 1517L,
> > > > > 1748L, 489L, 1777L, 854L, 2220L, 593L, 1418L, 599L, 1527L, 1076L,
> > > > > 2084L, 1797L, 1142L, 1210L, 1480L, 720L, 1561L, 873L, 1855L,
> > > > > 1241L, 824L, 1025L, 1340L, 769L, 1353L, 1518L, 931L, 1200L, 1132L,
> > > > > 871L, 965L, 807L, 1434L, 1294L, 834L, 628L, 731L, 913L, 879L,
> > > > > 1344L, 1233L, 1538L, 1155L, 507L, 415L, 620L, 183L, 793L, 1269L,
> > > > > 1457L, 394L, 130L, 1949L, 706L, 1262L, 939L, 822L, 1067L, 914L,
> > > > > 454L, 1710L, 1510L, 901L, 1370L, 1253L, 892L, 226L, 1048L, 899L,
> > > > > 1231L, 1424L, 1415L, 762L, 1536L, 1219L, 1132L, 1314L, 147L,
> > > > > 119L, 120L, 153L, 91L, 335L, 1929L, 1064L, 1012L, 1149L), Letters =
> > > > c(3746L,
> > > > > 3640L, 4295L, 4968L, 3202L, 777L, 3941L, 4769L, 1075L, 3829L,
> > > > > 5159L, 3527L, 7874L, 5236L, 6149L, 6555L, 10180L, 4960L, 5399L,
> > > > > 2767L, 3758L, 6275L, 6391L, 6151L, 4685L, 5267L, 4767L, 3780L,
> > > > > 1574L, 3261L, 9985L, 4106L, 1332L, 2557L, 2454L, 810L, 3540L,
> > > > > 1406L, 2668L, 2304L, 2474L, 3715L, 2090L, 533L, 2442L, 3587L,
> > > > > 7294L, 6363L, 5809L, 4997L, 3197L, 2941L, 4065L, 1743L, 2539L,
> > > > > 3149L, 5815L, 2765L, 4027L, 15366L, 2091L, 4988L, 4730L, 3149L,
> > > > > 5906L, 3704L, 4188L, 8675L, 4771L, 6586L, 2965L, 1313L, 6326L,
> > > > > 3948L, 3489L, 3942L, 5205L, 5309L, 4300L, 1574L, 4559L, 2393L,
> > > > > 2655L, 5755L, 4070L, 4027L, 5649L, 2651L, 4978L, 3970L, 3610L,
> > > > > 5112L, 6733L, 3513L, 7990L, 6356L, 6179L, 4340L, 4432L, 11260L,
> > > > > 1657L, 5913L, 5033L, 6274L, 5835L, 5865L, 7802L, 5776L, 6295L,
> > > > > 3034L, 4383L, 7842L, 6659L, 6905L, 5104L, 4162L, 7733L, 3321L,
> > > > > 6241L, 3212L, 6840L, 5294L, 2471L, 2748L, 10762L, 6269L, 7479L,
> > > > > 3416L, 6477L, 3043L, 4719L, 8764L, 2494L, 3168L, 4790L, 2983L,
> > > > > 5129L, 9795L, 5043L, 3913L, 7635L, 6693L, 4669L, 5968L, 7508L,
> > > > > 7271L, 7543L, 4730L, 4095L, 5152L, 9129L, 6395L, 7357L, 2166L,
> > > > > 7512L, 3544L, 9554L, 2505L, 6171L, 2568L, 6542L, 4781L, 8655L,
> > > > > 7721L, 4751L, 5134L, 6032L, 3013L, 6536L, 3504L, 7527L, 5106L,
> > > > > 3607L, 4247L, 5524L, 3404L, 5947L, 6506L, 3891L, 5256L, 4944L,
> > > > > 3700L, 4006L, 3392L, 5732L, 5185L, 3402L, 2451L, 2962L, 3679L,
> > > > > 3550L, 5429L, 5083L, 6354L, 4659L, 2155L, 1754L, 2496L, 721L,
> > > > > 3326L, 4917L, 5856L, 1589L, 564L, 8300L, 2987L, 5317L, 3820L,
> > > > > 3595L, 4483L, 3664L, 1805L, 6931L, 6225L, 3508L, 5481L, 5007L,
> > > > > 3623L, 891L, 4265L, 3746L, 5041L, 5823L, 5726L, 3238L, 6443L,
> > > > > 5053L, 4664L, 5370L, 614L, 459L, 491L, 629L, 351L, 1457L, 7812L,
> > > > > 4677L, 4305L, 4603L), Verses = c(20, 24, 31, 38, 22, 6, 22, 38,
> > > > > 6, 22, 36, 23, 42, 30, 36, 39, 55, 25, 24, 22, 26, 31, 32, 30,
> > > > > 25, 35, 34, 18, 11, 25, 54, 25, 8, 22, 26, 6, 30, 13, 25, 22,
> > > > > 21, 34, 16, 6, 22, 32, 30, 33, 35, 32, 14, 18, 21, 9, 15, 19,
> > > > > 35, 14, 18, 77, 13, 27, 27, 15, 30, 18, 18, 41, 27, 30, 15, 7,
> > > > > 33, 21, 19, 22, 29, 37, 35, 12, 31, 15, 20, 35, 29, 26, 36, 16,
> > > > > 39, 25, 24, 39, 37, 20, 47, 33, 38, 27, 20, 62, 8, 27, 32, 34,
> > > > > 32, 46, 37, 31, 29, 19, 21, 39, 43, 36, 30, 23, 35, 18, 30, 17,
> > > > > 37, 30, 14, 17, 60, 38, 43, 23, 41, 16, 30, 47, 15, 19, 26, 15,
> > > > > 31, 54, 24, 24, 41, 36, 25, 30, 40, 37, 40, 23, 24, 35, 57, 36,
> > > > > 41, 13, 36, 21, 52, 17, 34, 14, 37, 26, 52, 41, 29, 28, 41, 19,
> > > > > 38, 26, 39, 31, 17, 25, 30, 19, 26, 33, 26, 30, 26, 25, 22, 19,
> > > > > 41, 48, 34, 27, 24, 20, 25, 39, 36, 46, 29, 17, 14, 18, 6, 21,
> > > > > 33, 39, 9, 2, 49, 19, 29, 22, 23, 24, 22, 10, 41, 37, 43, 25,
> > > > > 28, 19, 6, 30, 27, 26, 35, 34, 23, 41, 31, 31, 34, 4, 3, 4, 3,
> > > > > 2, 9, 48, 30, 26, 34)), .Names = c("Chapter", "Words", "Letters",
> > > > > "Verses"), row.names = c(NA, -239L), class = "data.frame")
> > > > >
> > > > >
> > > > > --
> > > > > Gregory (Greg) L. Snow Ph.D.
> > > > > Statistical Data Center
> > > > > Intermountain Healthcare
> > > > > [hidden email]
> > > > > 801.408.8111
> > > > >
> > > > > ______________________________________________
> > > > > [hidden email] mailing list
> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > PLEASE do read the posting guide http://www.R-project.org/posting-
> > > > guide.html
> > > > > and provide commented, minimal, self-contained, reproducible code.
> > > > >
> > >
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: optimization challenge

Greg Snow-2
In reply to this post by Greg Snow-2
Well, Albyn Jones gave a great solution to my challenge that found the best reading schedule.

My original thought was that doing an exhaustive search would take too much time, but Albyn showed that there are ways to do it efficiently.

My approach (as mentioned before) was to use optim with method SANN.  I treated it like a balls and urns problem.  Since I wanted to read 239 chapters in 128 days it would be like putting 239 balls in 128 urns.  I started with 1 ball in each urn, then put the remain balls into urns to get a starting state (tried a couple different starting situations, 1 additional ball in each of the 1st urns, all the remaining balls in the first or last urn).  Then my update step was just to take a single ball from one of the urns with 2 or more balls and move it to another urn at random.

On tricky thing with this method is that moving one ball could change things quite a bit because one setup could have the longest chapters being read by themselves, but moving one ball would result in a long chapter now being grouped with others.  My first update function just moved the ball to a random urn, then I tried moving the ball only one urn forward or backwards (this seemed to work better, but probably needed a longer run time).  Finally the best method that I found chose the ball to move proportional to the lengths of the days reading and chose the urn to put it in with highest probability for days with the shortest readings.

I thought my answers were pretty good (they looked reasonable), but Albyn's solution had half the variance as my best result.  Below is the code that I used for my best results in case anyone is interested.  I would also be interested if anyone could find a way to improve on what I did to get better results (help me learn SANN better, the arguments I used came mostly from trial and error).

days <- seq( as.Date('1/24/10',"%m/%d/%y"), as.Date('5/31/10','%m/%d/%y'), by=1)

sq2.2 <- rep(1, length(days))
sq2.2[ length(days) ] <- nrow(bom3) -sum(sq2.2) + 1

genseq4 <- function(sq) {
        w <- rep(1:length(days), sq)
        tmp <- tapply( bom3$Verses, w, sum )

        ww <- which(sq>1)

        dwn <- if (length(ww) > 1) {
                sample( ww, 1, prob= tmp[ww] )
        } else {
                ww
        }
        up  <- sample( seq_along(sq)[-dwn], 1, prob=max(tmp) - tmp[-dwn] )

        sq[dwn] <- sq[dwn] - 1
        sq[up]  <- sq[up]  + 1

        sq
}

distance <- function(sq) {
        w <- rep(1:length(days), sq)
        tmp <- tapply( bom3$Verses, w, sum )
        var(tmp)
}

res <- optim(sq2.2, distance, genseq4, method="SANN",
        control=list(maxit=30000, temp=50, trace=TRUE ))



--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[hidden email]
801.408.8111

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.