## Wednesday, December 28, 2011

In these times of austerity, one has to be economical with one's wrapping paper.

Yeah, I know, this would have been much more useful about a week ago. But I just never got around to it. And I use the word 'useful' loosely; any saving one might get from these 'techniques' will most likely be insignificant.

Nonetheless...

Rectangles

All gifts - no matter shape or size - can be wrapped with a rectangular piece of paper.

This is easily proved. However doing so may not be the neatest or most efficient approach. For example, it's easy to efficiently wrap a cuboid (book, DVD, box) with a rectangular piece of paper, it's less easy to wrap a bike (efficiently) with a single, and in this case, very large piece. However, in the case of a bike, one could use several, smaller pieces of paper for more efficient wrapping.

For a ball, one might be tempted to say that a more unusual shape - maybe a truncated icosahedral shell - would be more efficient. However, cutting out such a shape would result in lots of little slivers of waste, and would take a lot of time and effort. Instead, it's usually easier, and not greatly inefficient to just use a rectangular piece, and some creative folding and scrunching.

So here's the point - for a given gift, we can define a rectangle (or set of rectangles) which most efficiently wraps that gift.

For a cuboid of sides L(ength), W(idth), H(eight), the minimum paper would be: (H+L) by 2(H+W), with a bit extra for overlap.

For a cylinder of dimensions H(eight) and R(adius), the paper would need to be: 2PI*R by (H+2R), and a bit.

[Proof that these cuts are optimal is left as an exercise for the reader.]

For other shapes, this may be more tricky to determine, but for the sake of arguing, we will assume we know all the required rectangle's dimensions.

Bin Packing

So for a given collection of gifts to be wrapped, we have defined a set of rectangles. We now have to determine the most efficient way to cut these out from rolls of wrapping paper - i.e. a large rectangle, ~ 1m x 4m.

This is similar to solving a two-dimensional version of the Bin Packing problem.

In the Bin Packing problem, we have several bins (cuboids) of fixed dimensions, and a collection of smaller cuboidal objects. The problem is to pack the objects into as few bins a possible - i.e. find an efficient way of fitting cuboids into boxes.

The problem is NP-Hard, meaning that finding the optimal solution can take a very long time. However, there are algorithms which are fast and give adequately optimal solutions.

One, in particular, is called the First Fit algorithm, and it works something like this:

1) Sort the rectangles into size order
2) Cut out the largest pieces still in the set
2 i) If there is not enough paper left on the current roll, move on to the next/start a new roll
3) Repeat (in descending size order) until there are no more pieces left to cut.

There are also instructions for how to place the rectangles, for example first fit from bottom and right, to left and top.
This article gives a more precise description of a similar algorithm, though that one is defined for a 'roll' of non-restricted dimensions.

Realistically

That's all well and good mathematically. But in the real world, it's likely more trouble to solve the problem than the saving in paper is worth.

However, the general principle of the algorithm can be loosely applied (if indeed it isn't the approach you already use) to improve on efficiency.

In general, you can easily sight-guess which gift in going to require the largest amount of paper. And from that you can apply the First Fit algorithm.

One change I make is, cut out and wrap the largest one left, first. Then wrap the gift(s) which best fits the off-cut.
This is particularly good for rolls, where you want to unroll and use it a 'slice' at a time (rather than unrolling the whole thing).

Anyway, something to think about next time you happen to be wrapping a large number of gifts.

Even better if you can get some of that Tesco wrapping paper with the grid on the back.

Oatzy.

[No wasted paper from my wrapping this year. Just saying.]

## Saturday, December 24, 2011

### Christmas Eve

The run up to Christmas is over, and hopefully everyone has their Christmas shopping done. So now seems like as good a time as any to look at what the crowds are doing:
Now that is interesting. I think it is, anyway.

So first of all, the weekday spikes. It seems like a significant number of people left their Christmas shopping to the last week. Now fair enough if they've only just finished work/university/whatever.

Notice also the general week-on-week trend for the weekdays.

But more importantly than all that, look at today - Christmas Eve.

Anti-Crowds

Everyone 'knows' that going shopping on Christmas Eve is suicidal. It's a well known clichÃ©. But that's the thing - everyone thinks it's going to be insanely busy, so a lot of people avoid shopping centres.

But then, this means the shopping centres end up practically deserted (by Christmas standards). And it's not necessarily the result of what I mentioned last time - I was out there, and I saw the lack of crowds.

It's an interesting, and previously studied phenomenon.

I won't go into too much detail, cause it's past midnight on Christmas Eve. But if you're interested in this effect, and in this sort of thing in general, I would suggest reading up on Complexity Theory. In particular, I'd recommend this book, which covers this exact subject in one of the early chapters.

Of course, if we were looking at supermarkets, that would be a different story..

Oatzy.

[Merry Christmas!]

## Sunday, December 18, 2011

### Deal or No Deal

One of the common criticisms of Deal or No Deal (DoND) is that it's basically all just luck, and requires no skill.

Okay, so unlike most TV game-shows you don't have to know anything to play. In fact, based on knowledge/skill required to win and on average pay-out, DoND is probably the best TV show to play on - as said, you don't have to know anything to play, and once you're on the show you're guaranteed to walk away with something.

Okay, so how much you walk away with does indeed depend a lot on luck. But I would argue that it depends on luck in the same way that, for example, Blackjack (21) depends on luck. One might argue that there's no skill in Blackjack, and yet there's no denying that some people fair better at it than others. And one might say this disparity is the result of skill or strategy.

Certainly card counting (when you can get away with it) is a skill, and one which can increase your winnings.

It's How you Play the Game

Behind all the 'quirky' contestants, the blatant superstition and Noel Edmond's interesting shirts, lies a surprisingly complex game. Albeit a 'complex' game which can be played competently with little to no skill.

In fact skill only comes into it when you start trying to tip the probability scales in your favour, without relying on trinkets and 'special numbers'.

But before we start, I should probably outline the game for the sake of anyone unfamiliar:

There are 22 boxes, of which one is yours. Each box contains some unknown amount of money, ranging from 1p to £250,000.

The idea is to try and determine the value of your box by eliminating other boxes from the game. Also trying to determine the value of your box is 'The Banker', who will attempt to make offers to buy your box, and in doing so, try to make a profit for himself.

Your aim is to come out of the game with as much money as possible - ideally, either by selling your box for more than it's worth, or by keeping a box which is more valuable than any of The Banker's offers.

Still with me?

So there's actually a big game theory element in this, when it comes to dealing with The Banker. There's also a risk assessment element, wherein, based on which values have been eliminated or are still in play, you have to decide whether you should play on.

In this blog, I'm going to be focusing on the offers part of the game, and deciding when to deal. I may come back to the other elements in later blog posts.

Let's Play a Card Game

Take a standard deck of 52 cards, shuffle them, and deal 10 cards face down.

You turn the cards over one at a time. If you want to keep the card you turn over, then that's your card forever. You can't swap it for another card later. If you turn over a card you don't want to keep then it's gone forever, and you can't go back to it later.

So, for example, say you turn over the first 4 cards, and you get - 2, 9, 8, 5 - Now you could have stuck on 9, since it's a pretty good card. But there are still 6 cards felt to turn over, so you decide play on.

You turn over 2 more cards - A, Q - Since Aces are low, it seems wise to stick on the Queen, since there's only the King that's better, and only 4 cards left to turn over.

So you turn the last 4 cards over  to see how well you fared, and you get - 7, 5, K, 10.

Okay, so there was a better card than the one you ended up with, but you still came out pretty well.

Anyway, hopefully you get an idea of how the game works.

So this is ultimately a game of chance and of how much risk you're willing to take to get a potentially better card. The question is, is there a strategy for maximising your chances of getting the best possible card?

Obviously, since the King is the absolute best possible card, you might be tempted to play until you find one. But since there's no guarantee that there will be a King on the table, you risk playing to the end, only to end up stuck with the last card, which could be really crappy.

The Dating Game

In the book "How Long is a Piece of String", the author offers a variation on this game as a very simplified analogy for dating - the idea being to settle down with the best possible partner, bearing in mind that you usually won't be able to go back to dating anyone you passed on if you aren't able to find anyone better later, and that once you settle down into a relationship you stop dating, and potentially miss out on meeting someone better.

As far as the strategy goes, the author suggests turning over the first three cards as a sort of sample pool, then settling on the first card whose value exceed those of all three sample cards.

According to his probability measurements, this results in you getting the best card one time out of three. Which, apparently, is as good as you can hope to do.

So in the above example, your sample cards are 2, 9, 8. The first card after that which is greater than all those is, in fact, the Queen.

Like I said, it's not perfect. And a 1 in 3 success rate may not sound very good. But lets be clear - this is (theoretically) your absolute best strategy.

Let's Make a Deal

So ignoring the whole boxes stuff completely (I know, but stay with me here) we can reduce DoND to being a game where a contestant is presented with 7 (random) offers and is trying to decide which one (if any) to take.

And this is where the card game comes in - it's basically the same game, just with fewer offers/cards.

In fact, there are 7 offers, but if you decide to reject all those, you're stuck with the value in your box. So that's equivalent to the card game described above played with 8 cards.

So first of all, I should point out that the three card strategy from the book applies to a slightly different variation on the game I outlined. So we also have to tweak our strategy to match our tweaked game.

First of all, the book itself offers this detail - for a game with N card (choices), the sample size should be N/e, where e=2.71.. So for a game of 8 choices our sample should be 2.9 (~3).

But again, this is for a slightly different game than ours. So we'll want to check that this strategy really does apply to our game.

Thing is, there's a whole bunch of probability calculations involved in working that out. And probability calculations are a pain in the arse. So in lieu of all that, I did a Monte Carlo simulation instead.

Here's the code.

And what the code suggests, is that the best strategy is actually to pass on the first two offers - giving a success rate of about ~49%

For comparison, here's the success rates for different numbers of passes (rejected offers):
And nearly 50:50 isn't bad odds. But we do have to keep in mind that this result is for a simulation of a card game which is a simplified analogy for one element of a game show.

But the implication is this - in a simplified game of DoND, the optimal strategy is to reject the first two offers, and take the first one after that which exceed them.

Spank the Banker

Obviously, DoND, when considered as a whole, is not as simple as accepting or rejecting a series of offers. For one thing, in the real game you have more information (i.e. what values are or aren't still in play) to help inform your decisions.

But this strategy could be considered at least some improvement on the 'by the gut' approach.

So yeah. Until possibly next time..

Oatzy.

[Get it? 'Cause 'deal' can also mean 'to distribute cards'.]

### Panic Saturday

Leaving it til the last Minute

So yesterday was what the tabloids called Panic Saturday - the last Saturday before Christmas (excluding Christmas eve), and the day on which everyone goes batshit crazy 'cause they've just realised, it's a week to Christmas and they haven't finished their Christmas shopping yet!

CHRISTMAS!!

So we go to our check-in data, and we can clearly see tha-
Oh.

It's actually the least busy Saturday of the last four weeks..

Or Is It?

Okay, so professional analysts probably have better metrics for measuring crowd sizes than my armchair data collecting.

So what's going on?

One theory would be that Foursquare users, generally, got their Christmas shopping done relatively early. And it's a nice thought, to think that people who obsessively log where they've been are 'more organised' when it comes to shopping.

But the more likely explanation is this - if places were that busy, and if people were that panicked, maybe they were just to busy to bother stop and check-in to Foursquare.

That's what my money is on, anyway.

And as a final point of interest, here is the updated graph for Westfield London
Notice the massive spikes on weekdays, week 4, compared to previous weeks.

Did a load of people finish work/school at the end of week 3, and only just get started on their shopping? Is it just last minute panic? In light of the point of this post, can we even trust these numbers?!

Also notice that the numbers for this particular shopping centre are quite different from the averages.

So all this kinda puts a damper on the reliability and usefulness (if indeed there ever was any) of my data and results.

Oh well.

Oatzy.

[Or the shops weren't nearly as busy as the papers claim.]

## Friday, December 09, 2011

### Best Day to Go Shopping

The Question

As the title suggests - which is the best (least busy) day to go shopping on?

Or more generally, how do crowds at shopping centres vary over time? Which days are busiest, or least busy? Are the shops getting busier as we get closer to Christmas? Less busy?!

It an interesting question, and one that's probably been looked into before. But still, I had an idea and I'm running with it.

[Feel free to skip straight to the results if you're not interested in statistics and the likes..]

Data Collecting

Data collection on this is tricky. Especially if you don't have legions of people to go out and actually count people. What I want to do is extract numbers with the minimum of effort.

So here's the game - Foursquare.

If you're unfamiliar, Foursquare is a 'social game', for which you 'check-in' to locations and earn points and badges accordingly. It's also good if you're an obsessive types who likes to keep track of where they've been.

So the idea is this - some subset of shoppers will be Foursquare users, who will check-in when they visit any given shopping centre. If we can extract check-in counts over a given time period, hopefully that can be used as an indicator of a place's 'busyness'. Obviously, this is flawed - but more on that below.

Right. So on the Foursquare page for a given venue, there isn't a total historical record of check-ins over time. Instead, what we have is the total number of check-ins at that location up to the time when you loaded the webpage.

What we do, then, is record that number at some fixed time every day (say, midnight). Then the number of check-ins on a given day is the difference between the total at the end of the day and the total for the end of the previous day. Easy.

In fact, to make life a little easier, I wrote this bit of code [python]. All I have to do is remember to run that every evening, and we have our data.

Sampling

There are two possible sources of sampling errors:

1) Location

If we only track one location, we have a very small sample size. That means we're subject to perturbations - for example, a major event like the Christmas lights being switched on - or just general statistical noise. Also, shopping patterns may vary across the country, or depending on how close to a city centre the centre is located, and so on.

It's actually fairly easy to overcome this. First of all, we have this list of the largest shopping centres in the UK. From that list I picked 20 locations to sample. This data can then be normalised and averaged to look for any general patterns that are (relatively) store independent.

Oh, and I should probably mention, since this data is being collected from places in the UK only, patterns may vary for different countries.

2) Users

Using Foursquare data, we're working on the assumption that as the number of shoppers increases (or decreases), the number of Foursquare check-ins will increase in proportion.

This is not necessarily the case.

First of all, we look up the Foursquare user demographics. There is no one source of definitive data on this (that I could find). But to get a general idea, there is this, based on a survey of BART travelers.

Obviously, this demographic source is for users of an American transport service, but I'm assuming it's representative of Foursquare users in general.

From this we see that the typical user is most likely male, age 25-34. Or to put it another way, women, young people, and old people are under-represented. And from my experience, it seems like women and old people are the most common shoppers on weekdays.

So this may introduce a disparity between the data and reality. But it's not one we can really do anything about (without seeking an alternative source of data). So, as long as there is a general size proportionality between shoppers and check-ins, we'll consider the data acceptable.

Results

By far, Saturday is the worst day to go shopping (in terms of crowds). But you already knew that.

So I have my data for the last 3 weeks, for 20 shopping centres across Britain. Here's the raw data, for if you're into that sort of thing.

I worked out the check-ins for each place on each day, then normalised by shopping centre - so that the total number of check-ins for each shopping centre over the three week period now adds up to 100. I then averaged these 'norms' across all shopping centres.

Here's what those results look like.
In fact, I went back and 'tidied up' the data, removing venues with less than 100 check-ins total during the recording period - since their sample sizes were maybe too small for any patterns to be statistically significant - and removed a couple of anomalies (one place ended up with negative check-ins).

This is what the tidy plot looks like
[Updated since original post]

Pretty similar, but some of the bars are now closer together (removes the anomalies from Sunday and Wednesday).

So from the results above, the order of days, from least to most busy, seems to be:

1) Monday
2) Thursday
3) Tuesday
4) Wednesday
5) Sunday
6) Friday
7) Saturday

But note, it's pretty close amongst the top 3 least busy days.

It shouldn't be too surprising that weekdays are less busy than weekends - what with people working.

As for Sunday being so low compared to Friday and Saturday - well, that might be the result of Sunday opening hours.

For example, Meadowhall has typical opening hours of 9am-8pm, but on Sundays it's 11am-5pm -> 11hrs vs 6hrs. So maybe it would make sense to re-adjust accordingly. But deciding how, exactly, to re-adjust is tricky. So we'll just leave it be.

NB/ Wednesday, week 2 maybe distorted due to public sector strikes - there certainly appeared to be more people on the train. But but a lot of that increase was from children (see sample bias above).

Of course, each shopping centre is unique, and there will be variation as to which days are best and worse for each. As an example, here's what the (non-normilised) plot looks like for Westfield London (the most checked-in to shopping centre by far)
Again, pretty similar to the average. But in this case, Wednesday is less busy than the average, and Thursday more. And Friday has that weird dip in week 2, bringing its average down.

One last thing I'd like to point out - notice there is no particular week-on-week trend. That is, the number of check-ins isn't (on average) increasing as we get closer to Christmas. Or decreasing for that matter. Which is, perhaps, not what you'd expect.

But maybe that will change within the next couple of weeks. And certainly after Christmas, when the January sales kick off. Maybe.

This is an on-going project - bear in mind, this is only 3 weeks worth of data, so it may be too soon to draw any solid conclusions - but I will keep you posted. Maybe I'll do another post just after Christmas, or after New Year's. At any rate, I'll tweet it when I do.

Oatzy.

[There's always online shopping..]