The friendly skies
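Here are all flights departing from New York City airports in 2013 (more details: https://cran.r-project.org/web/packages/nycflights13/index.html):
library(tidyverse)
library(nycflights13) # provides flights, airlines, planes and weather
flights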
The column arr_delay shows whether a flight landed early (negative) or late (positive). It has NAs:
any(is.na(flights$arr_delay))
[1] TRUE
(is.na(x) returns a logical vector the same length as x, each element TRUE or FALSE, but any(is.na(x)) returns a vector of length 1.)
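A quick check on a toy vector makes the difference in lengths concrete. The values below are made up for illustration and are not taken from flights:

```r
# Toy vector with two missing values (hypothetical data)
x <- c(5, NA, -3, NA, 12)

is.na(x)          # one TRUE/FALSE per element of x
length(is.na(x))  # same length as x
any(is.na(x))     # a single TRUE/FALSE: is anything missing?
sum(is.na(x))     # how many values are missing
```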
Let’s filter out the NAs with tidyr::drop_na() and plot the distribution (called with no arguments, drop_na() removes every row that has an NA in any column):
flights %>%
  drop_na() %>%
  ggplot(., aes(x = arr_delay)) +
  geom_density()
Looks like flights arrive roughly on time on average (who knew!). But we have lots of outliers. Let’s filter some out so we can run an analysis on the bulk of non-outlier observations. The logic is that the processes that generate huge delays may be systematically different from those that generate small delays.
Use stat_ecdf() to plot the cumulative distribution:
flights %>%
  drop_na() %>%
  ggplot(., aes(x = arr_delay)) +
  stat_ecdf() +
  geom_vline(xintercept = 100, color = "red", linetype = "dotted")
So about 98% of flights have arrival delays below 100 minutes (early arrivals included). Let’s filter and re-plot the density:
flights %>%
  drop_na() %>%
  filter(arr_delay < 100) %>%
  ggplot(., aes(x = arr_delay)) +
  geom_density()
(Looks like the typical flight actually arrives early!)
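The share of observations below a cutoff does not have to be read off the ECDF plot; it can be computed directly. Here is a minimal sketch on a made-up vector of delays (the numbers are hypothetical, not the flights data), using base R's ecdf():

```r
# Hypothetical arrival delays in minutes (not the real flights data)
delays <- c(-20, -5, 0, 3, 15, 40, 75, 99, 120, 300)

# Proportion of delays strictly below 100 minutes
share_below_100 <- mean(delays < 100)

# ecdf() returns a function giving the proportion of values <= its argument
delay_cdf <- ecdf(delays)
delay_cdf(99)
```

On the real data the same idea would be mean(flights$arr_delay < 100, na.rm = TRUE).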
Side note: you can’t simply log-transform the data to pull in the outliers. Why? Because some arrivals are on time (arr_delay == 0) and others are early (arr_delay < 0). The problem is that log(0) is \(-\infty\) while log(-1) (or the log of any negative number) is undefined; R returns NaN with a warning.
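A short illustration of how R treats these cases, on three made-up delays (one early, one on time, one late):

```r
# Hypothetical delays: early, on time, late
delays <- c(-10, 0, 25)

# log() of a negative number warns and yields NaN; log(0) yields -Inf
logged <- suppressWarnings(log(delays))
logged

# Only the strictly positive delay survives the transformation
is.finite(logged)
```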
Modeling
What predicts late (or early) arrivals?
You could build a simple model of arrival delays as a function of certain variables, like weather, the operating airlines, the type of plane, and other variables plus some error:
\[
\text{arrival delay} = f(\text{explanatory variables}) + \epsilon
\]
and then estimate the parameters of \(f\) to explain the conditional average delay (the average delay conditional on your explanatory variables). If your model \(f(\dots)\) is linear then you can just run OLS with lm(). And so on.
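To see what estimating the parameters means in practice, here is a minimal sketch on simulated data. The true intercept (5), slope (0.1) and noise level are assumptions chosen purely for illustration; lm() should recover values close to them:

```r
set.seed(1)

# Simulate a linear relationship: delay = 5 + 0.1 * temp + noise
n     <- 200
temp  <- runif(n, min = 20, max = 90)
delay <- 5 + 0.1 * temp + rnorm(n, sd = 2)

# Fit by OLS and inspect the estimated intercept and slope
fit <- lm(delay ~ temp)
coef(fit)
```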
Where are my explanatory variables?
But while flights does have information about carriers in the column carrier:
flights %>%
  drop_na() %>%
  filter(arr_delay < 100) %>%
  group_by(carrier) %>%
  summarise(avg_delay = mean(arr_delay))
we only have an abbreviation (we’d have to look up those values). The full carrier names are stored in airlines.
More importantly, flights has scant data on planes or weather.
Those data are stored in separate objects: planes and weather.
Can we join these data sets? Yes. But it’s not as simple as just binding columns together. Each data set has a different number of rows:
nrow(flights)
[1] 336776
nrow(airlines)
[1] 16
nrow(planes)
[1] 3322
nrow(weather)
[1] 26115
Merging data
The challenge is to join these data sets. The key to solving this problem is literally that: the key of each data set.
Data sets can have keys that uniquely identify observations. This builds on the idea of “tidy” data: each row a unique observation.
For instance, tailnum in planes uniquely identifies each plane. We can see this by counting the number of repeats:
planes %>%
  count(tailnum) %>%
  filter(n > 1)
and finding none.
But tailnum does not uniquely identify observations in flights:
flights %>%
  count(tailnum) %>%
  filter(n > 1)
That is because the same plane flies many flights over the course of a year.
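The same uniqueness check can be wrapped in a small helper. The sketch below uses toy tables and a hypothetical function name, is_key(); neither comes from nycflights13 or dplyr:

```r
library(dplyr)

# Toy tables (hypothetical data): one row per plane, several rows per plane
planes_toy  <- data.frame(tailnum = c("N1", "N2", "N3"),
                          engines = c(2, 2, 4))
flights_toy <- data.frame(tailnum = c("N1", "N1", "N2", "N3", "N3"),
                          arr_delay = c(5, -3, 12, 0, 40))

# Hypothetical helper: TRUE if the given columns never repeat in `data`
is_key <- function(data, ...) {
  dupes <- data %>%
    count(...) %>%
    filter(n > 1)
  nrow(dupes) == 0
}

is_key(planes_toy, tailnum)   # tailnum identifies each plane
is_key(flights_toy, tailnum)  # but not each flight
```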
But tailnum in flights maps onto tailnum in planes.
In fact, there are several mappings between all the data sets:
and we use the relationships or mappings between data sets to join them (hence “relational data”).
Joins can get really complicated. We will just focus on one type of join: an outer join.
Outer joins
In an outer join you keep all observations in one data set and add what you need from another.
dplyr has three types of outer joins: left_join(), right_join() and full_join().
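On two toy tables (made-up data, with column names chosen for illustration) the three outer joins differ only in which unmatched rows survive:

```r
library(dplyr)

# Keys 1 and 2 appear in both tables; 3 only in x, 4 only in y
x <- data.frame(key = c(1, 2, 3), val_x = c("x1", "x2", "x3"))
y <- data.frame(key = c(1, 2, 4), val_y = c("y1", "y2", "y4"))

left_res  <- left_join(x, y, by = "key")   # keeps every row of x
right_res <- right_join(x, y, by = "key")  # keeps every row of y
full_res  <- full_join(x, y, by = "key")   # keeps every row of both

left_res
right_res
full_res
```

Unmatched rows are kept, with NA filled in for the columns that come from the other table.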
left_join is the most common. If you have two data sets x and y, then:
left_join(x, y, by = "key")
or
x %>%
  left_join(y, by = "key")
says “keep all the data in x, and tack on the matching observations in y”. Matches are identified by “key”. Using the pipe (%>%) makes it easier to chain many joins.
Left joins are also the easiest to think about (in my opinion).
First, identify your “primary” data. In our case it’s flights.
Then, let the “non-primary” data “join in”.
Checkpoint
Let’s join flights with planes using left_join. The key again is tailnum.
flights_planes <- flights %>%
  left_join(planes, by = "tailnum")
flights_planes
Now we have a new data set with the same number of observations as flights, but with richer information about the type of plane used for each flight.
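The row count is preserved because tailnum is unique in planes. A minimal sketch on made-up tables shows what would happen otherwise: if the key repeats in the right-hand table, a left join returns one row per match and the result grows. All names and values below are hypothetical:

```r
library(dplyr)

# One row per flight (hypothetical data)
flights_toy <- data.frame(tailnum = c("N1", "N2"), arr_delay = c(5, -3))

# Right-hand table where tailnum is a proper key
planes_ok  <- data.frame(tailnum = c("N1", "N2"), engines = c(2, 4))

# Right-hand table where "N1" appears twice, so tailnum is not a key
planes_dup <- data.frame(tailnum = c("N1", "N1", "N2"), engines = c(2, 2, 4))

joined_ok  <- left_join(flights_toy, planes_ok,  by = "tailnum")
joined_dup <- left_join(flights_toy, planes_dup, by = "tailnum")

nrow(flights_toy)  # rows before joining
nrow(joined_ok)    # unchanged
nrow(joined_dup)   # one extra row from the duplicated key
```

Comparing nrow() before and after a join is a cheap sanity check.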
For instance, is there a relationship between the size of a plane (as measured by the number of engines) and delay time?
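flights_planes %>%
  # drop rows missing only the columns used here; a bare drop_na() would also
  # discard every flight with an NA in any other column (e.g. speed)
  drop_na(arr_delay, engines) %>%
  group_by(engines) %>%
  summarise(mean_delay = mean(arr_delay)) %>%
  ggplot(data = ., aes(x = engines, y = mean_delay)) +
  geom_col()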
Checkpoint
Try to left_join() weather to flights and create the object flights_weather. Looking at the relations picture above we see that the keys mapping the two data sets are c("year", "month", "day", "hour", "origin").
flights_weather <- flights %>%
  left_join(weather, by = c("year", "month", "day", "hour", "origin"))
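When no single column identifies a row, several columns together can act as the key. A minimal sketch with toy tables (hypothetical data, and only two of the five key columns to keep it short) shows a join on a compound key:

```r
library(dplyr)

# Toy flights: origin alone or hour alone would match several weather rows
flights_toy <- data.frame(origin = c("JFK", "JFK", "LGA"),
                          hour = c(6, 7, 6),
                          arr_delay = c(10, -2, 30))

# Toy weather: one row per origin-hour combination
weather_toy <- data.frame(origin = c("JFK", "JFK", "LGA", "LGA"),
                          hour = c(6, 7, 6, 7),
                          temp = c(40, 42, 39, 41))

# Rows match only when both origin and hour agree
joined <- flights_toy %>%
  left_join(weather_toy, by = c("origin", "hour"))

joined
```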
Is there a relationship between temperature (temp) and delay time (arr_delay)?
flights_weather %>%
  # keep rows where both plotted variables are observed
  drop_na(temp, arr_delay) %>%
  ggplot(aes(x = temp, y = arr_delay)) +
  geom_point(alpha = 0.5)
Checkpoint
Join flights, airlines, planes and weather. Then test the hypothesis that JetBlue flights are significantly more delayed than American Airlines flights using a linear regression via lm(), controlling for the temperature and the number of engines on a plane. The model is:
\[
\begin{aligned}
\text{arrival delay} &= f(\text{airlines}, \text{temperature}, \text{engines}) + \epsilon \\
&= \beta_0 + \beta_1(\text{airlines}) + \beta_2(\text{temperature}) + \beta_3(\text{engines}) + \epsilon
\end{aligned}
\]
First join the data (hint: join weather before planes):
flights_enhanced <- flights %>%
  left_join(airlines, by = "carrier") %>%
  left_join(weather, by = c("year", "month", "day", "hour", "origin")) %>%
  left_join(planes, by = "tailnum")
Recall the airline codes: AA is American Airlines and B6 is JetBlue Airways.
Now filter and estimate the model:
flights_enhanced %>%
  filter(carrier %in% c("AA", "B6")) %>%
  lm(formula = arr_delay ~ name + temp + engines, data = .) %>%
  summary()
Call:
lm(formula = arr_delay ~ name + temp + engines, data = .)

Residuals:
    Min      1Q  Median      3Q     Max
 -80.00  -23.86  -12.15    7.64 1006.07

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         -3.023253   2.222248  -1.360    0.174
nameJetBlue Airways  9.266306   0.478394  19.370  < 2e-16 ***
temp                 0.070717   0.009884   7.155 8.46e-13 ***
engines             -0.389717   1.091463  -0.357    0.721
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 43.31 on 62780 degrees of freedom
  (24580 observations deleted due to missingness)
Multiple R-squared:  0.006687,  Adjusted R-squared:  0.006639
F-statistic: 140.9 on 3 and 62780 DF,  p-value: < 2.2e-16
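Seems so! Holding temperature and engines fixed, JetBlue flights arrive about 9.3 minutes later on average than American Airlines flights, and the difference is highly significant. That said, the model explains less than 1% of the variance in arrival delays.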
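The coefficient labelled nameJetBlue Airways is a contrast against the baseline level of name, which lm() takes to be the first level alphabetically. A minimal sketch on four made-up flights (the delays are hypothetical) shows that, with no other regressors, this coefficient is just the difference in group means:

```r
# Four hypothetical flights, two per airline
toy <- data.frame(
  name = c("American Airlines Inc.", "American Airlines Inc.",
           "JetBlue Airways", "JetBlue Airways"),
  arr_delay = c(0, 4, 10, 12)
)

# lm() turns the character column into a factor; the first level is the baseline
fit <- lm(arr_delay ~ name, data = toy)
coef(fit)

# Difference in mean delay between the two groups
group_means <- tapply(toy$arr_delay, toy$name, mean)
group_means[["JetBlue Airways"]] - group_means[["American Airlines Inc."]]
```

Once other regressors such as temp and engines are added, the coefficient becomes the difference in average delay holding those variables fixed.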