Huh, Seven Years. Look at that…

It turns out that seven years ago today I started my trial project with Automattic. I know this because that is when I signed up for user id 23314024 on WordPress.com.

Screen Shot 2018-05-11 at 6.25.55 PM.png

My official Automatticversary – as we call the day you became a full time employee – is not until mid-July, but this notification inspired me to be a bit nostalgic. It also happens to fall on the week of my 40th birthday.

My trial project was building a prototype Elasticsearch system (version 0.16 IIRC) so we could search across our internal p2s (disclaimer: no longer what p2 looks like). At the time there were 291k posts and comments across our 89 p2s and MySQL search was no longer working well. There are now almost 2.8m posts and comments across 639 p2s and Elasticsearch is continuing to do pretty well.

We now have six Elasticsearch clusters, the largest of which had 6,782,578,164 posts in 2625 shards a few seconds ago. Those clusters have about 850 separate indices that are powering over 66 different use cases. Pretty exciting and humbling how much the company’s usage has grown.

If you are looking to get in on the Search Wrangling fun, then we are hiring. We have a ton of challenging search and relevancy problems to tackle.

 

WordPress and Democratizing Algorithms

In discussing how the newly released Jetpack Search fits with Core WordPress search I veered off into discussing algorithms in general. That generated some healthy discussion on the WPTavern Podcast. I wanted to expand on what I was saying a bit about democratizing algorithms. This is very much related to what JJJ called “race car technology makes its way down to everybody” on the podcast.

I think some of the negative reaction is a personal preference where many readers loudly prefer reverse chronological because it is easy to understand and they feel in control. I think that is a very common opinion among WordPress developers. I’ve had that discussion many, many times. I also think that user control is important.

However, it is very clear that ON AVERAGE having some algorithm filter or reorder the content is much more engaging. It boosts visitor engagement.

Let’s look at some top examples: Twitter, The New York Times, The Washington Post, and Amazon (30% of humans use Facebook, so let’s stipulate that algorithmic is working pretty well for them).

Twitter

Many people are loudly complaining about Twitter not being completely reverse chronological. However, as a product and service, they just had their first profitable quarter ever. And it wasn’t because they have more users. Monthly active users are flat. But daily active users increased 12% year-over-year. They have steadily made their product more engaging to the average user. So while yes, many people dislike the algorithmic changes they are making, those changes also seem to be working.

New York Times and Washington Post

Beyond even human “algorithms” that order which articles are on the front page, it is very common to have algorithmic sections of the front page of major news sites. Sometimes it is “most emailed stories”, or sometimes it is the most viewed. The current NYT’s home page has a section which looks like the top posts from across the past few days in a nice scrollable display. Maybe they are hand picked, but I bet there is data influencing them. The Washington Post has “Most Read” and “Live Discussions” sections.

Even on this blog I’m using Jetpack’s top viewed posts widget and Jetpack related posts. Both are things that WordPress Core can’t do.

Amazon

Algorithms are everywhere on Amazon. Let’s look at the dog food I buy. The reviews are clearly sorted by what will get me to buy it.

How could I not buy it!

It is about Engagement!

None of these examples are easy (or cheap) to deploy right now. It is still pretty hard to build an engaging website. Maybe building an engaging website is plugin territory. That sounds great for Jetpack which has more and more of these features. I’m not sure it is great for the Web though.

This even comes up with simple websites. Gutenberg opens up very interesting questions about how blocks should be organized on a web page. Chris Lema mentioned this when discussing the future of web publishing:

The future can’t continue to be a unidirectional dynamic where someone in marketing determines the best articulation of their message in a single-focused and static design.

The future of publishing is that different people can get different content depending on their behavior, demographics, interest and more.

I think framing it as personally preferring algorithmic vs chronological is really the wrong way to think about it. The question is, if you could flip a switch and get your visitors to spend 20% more time on your website, then would you?

Right now, only big expensive web sites can toggle that switch. Many of them are effectively monopolies. Let’s democratize algorithms so any website can choose to build a highly engaging user experience.

An Aside

While we can and should talk about filter bubbles and the impact that these algorithms have on the world, a world where only monopolistic tech giants can deploy these algorithms is not one where publishing is democratic.

 

Combining Search Scores: Winning and Failing

Trey Jones at the Wikimedia Foundation published some very interesting notes about how to think about combining scores for search ranking (particularly in Elasticsearch). I like this insight a lot:

addition is looking for ways to win, multiplication is looking for ways to fail

This is pretty interesting to me when thinking about how I chose to implement the ranking for the WordPress.org plugin search. Applying this insight to the way I combined signals in that ranking function comes up with a couple of interesting observations:

  • The text matching features (phrases, title matches, etc) are looking for ways to win and boost the score. This was a pretty explicit goal of mine, but also partly driven by decoupling the matching of text from boosting on text.
  • All of the other signals are looking for reasons to fail. Not updating the plugin, not testing it on latest WordPress, not resolving support threads, etc. There is some boosting also, but we do a lot to lower scores which is maybe related to some of the exact matching problems I am still looking at (especially after result number 10).

I’m not sure this is either good or bad, just an interesting model for thinking about it and something I need to think about some more. This somewhat matches the intuition that led me to separate out matching text from boosting text with individual features.
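To make the quote concrete, here is a small sketch with made-up numbers. It is not the actual plugin search ranking function; the signal names and values are purely hypothetical. It combines one text score with two signals in the 0 to 1 range, first by adding and then by multiplying:

```shell
#!/usr/bin/env bash
# Hypothetical illustration of additive vs multiplicative score combination.
# The values below are invented; they are not real ranking signals.

# Sum all arguments: any strong signal can lift the total ("ways to win").
combine_add() {
  awk 'BEGIN { s = 0; for (i = 1; i < ARGC; i++) s += ARGV[i]; printf "%.2f\n", s }' "$@"
}

# Multiply all arguments: any weak signal drags the total down ("ways to fail").
combine_mul() {
  awk 'BEGIN { p = 1; for (i = 1; i < ARGC; i++) p *= ARGV[i]; printf "%.2f\n", p }' "$@"
}

text_score=8      # hypothetical text match score
freshness=0.9     # hypothetical "recently updated" signal
support=0.1       # hypothetical "support threads resolved" signal

echo "additive:       $(combine_add "$text_score" "$freshness" "$support")"
echo "multiplicative: $(combine_mul "$text_score" "$freshness" "$support")"
```

With these numbers the additive score barely notices the poor support signal, while the multiplicative score collapses because of it. That is the asymmetry the quote describes.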

I also need to think more about whether I am using the right operations for weighting different scores. There are a lot of great thoughts in these notes, and Trey has a bunch of other notes that look interesting also.

Also it reminds me how great it is to have notes published for others to look at.

 

Senator Bennet, your email address doesn’t work.

I emailed my Senator a few weeks ago to oppose Gorsuch being appointed to the Supreme Court. His response seems to indicate he thinks Trump nominees are still worthy of consideration. That’s an absurd stance since they have been found to be lying under oath during the confirmation process.

Here is Senator Bennet’s response:

Dear Gregory:

Thank you for contacting me regarding the current vacancy on the Supreme Court.

On January 31, 2017, President Trump nominated Judge Neil Gorsuch to serve on the Supreme Court. He has served on the 10th Circuit Court of Appeals since 2006.  Gorsuch clerked for Judge David Sentelle on the U.S. Court of Appeals for the D.C. Circuit and for Supreme Court Justices Byron White and Anthony Kennedy. He also served as the Principal Deputy to the Associate Attorney General at the U.S. Department of Justice.

I take seriously the Senate’s constitutional responsibility to thoroughly vet Judge Gorsuch. I intend to review his record carefully in the coming weeks. Rest assured, I will keep your thoughts and concerns in mind throughout the confirmation process.

I value the input of fellow Coloradans in considering the wide variety of important issues and legislative initiatives that come before the Senate. I hope you will continue to inform me of your thoughts and concerns.

For more information about my priorities as a U.S. Senator, I invite you to visit my website at http://bennet.senate.gov/. Again, thank you for contacting me.

Below is my response:


Senator, thanks for your response.

He has served on the 10th Circuit Court of Appeals since 2006.  Gorsuch clerked for Judge David Sentelle on the U.S. Court of Appeals for the D.C. Circuit and for Supreme Court Justices Byron White and Anthony Kennedy. He also served as the Principal Deputy to the Associate Attorney General at the U.S. Department of Justice.
 
None of this matters. You should be working to delay each and every nomination by Trump because the longer we delay them the less damage he can cause.
 
There should be no more confirmations until Sessions has resigned for lying under oath during his confirmation process. How can you trust anything that any Trump nominee has said in any hearing until it is clear that they take not lying to Congress seriously?

Alas… it seems he doesn’t want to accept email responses…

Your message was sent to a non-monitored mailbox and has not been reviewed. If you would like to contact Senator Michael Bennet please visit his website at http://bennet.senate.gov/contact and fill out the webform for a prompt response. Thank you.

So much for dialog… Hey look I have a blog…

Top Five Posts from 2016

Most of this blog’s 40k visitors a year are looking at the epic Elasticsearch posts that I wrote years ago. For the most part they seem to still be relevant to people even if they are somewhat outdated. Here are my top posts with some commentary about each of them.


1: Elasticsearch: Five Things I was Doing Wrong

79% of my traffic comes from search engines, and almost 50% of all traffic goes to this one post. It’s actually kinda crazy that such a simple post gets so much of my traffic. I blame the clickbait headline. I have a bunch of long winded epic posts and what I should probably be writing is these small tidbits as they come up.

2: Three Principles for Multilingal Indexing in Elasticsearch

This is my all time favorite post. After 2.x and the removal of being able to specify an analyzer in a query it has become a bit outdated, but the overall concepts are still good. I love all the comments this post has generated. I’ve learned so much from this post and the discussions that it generated. We’ve accomplished a lot over the past year to adjust our multi-lingual indexing (deployed edgengrams into an A/B test yesterday) and I’m hoping to write up what my latest thinking is soon.

3 and 4: Scaling Elasticsearch Series

The first two parts of this three part series are my third and fourth most popular posts. The indexing post is almost twice as popular as the intro and querying posts. Although these posts are almost three years old now they still describe pretty well how we scale most of our queries. Most of the reason why these posts haven’t been updated is because the methods they describe have worked really well for us.

The original post talks about having 600 million posts in the index and 23m queries a day. We now have 4.3 billion posts and do about 45m queries a day. That’s some good scaling.

Only over the past year have we started to see some problems slowly develop with our global cluster scaling. Currently the cluster runs fine for about a month or so and then heap usage creeps upwards until it starts to cause problems. The solution is just to do a rolling restart of the cluster. Not pretty, but it works. Here’s what our average heap usage looks like broken down by data center for the past 30 days.

Screen Shot 2017-01-06 at 9.28.48 AM.png

We think a lot of these are just memory management bugs in the old Elasticsearch version we have been running for years and are hopeful that as we transition to 2.x many of them will be resolved. The other option is just to add more servers which we haven’t done in a few years. Our typical load is not very high though until we reach the point of running out of heap so I haven’t felt very justified in ordering more servers for this cluster yet.

One high point of this cluster is it taught us how to run a multi data center cluster. Every cluster we deploy now is multi-data center and we have successfully survived cases where an entire data center goes down. Currently we are in three data centers spread across the US. It’s likely that in 2017 we will start trying to run intercontinental Elasticsearch clusters (Europe and the US). Should be exciting.

5: Managing Elasticsearch Cluster Restart Time

This post describes how we manage long restart times. 2.x is a bit faster in this regard, but still takes a while to synchronize, so this is still relevant to managing a production ES cluster.

 

 

Notes from The Printing Revolution

I’ve been reading The Printing Revolution in Early Modern Europe by Elizabeth Eisenstein on the transition of Europe from a scribal culture to a printing culture. In referencing Michael Clapham:

A man born in 1453, the year of the fall of Constantinople, could look back from his fiftieth year on a lifetime in which about eight million books had been printed, more perhaps than all the scribes of Europe had produced since Constantine founded his city in A.D. 330.

The printing press had an immense and hard to correlate impact on the past 600 years. How will we build the tools and discover the norms that will shape the next 600 years? Here are some quotes that jumped out at me from the third chapter on the “features of print culture”.

“Increased output and altered intake”

To consult different books it was no longer so essential to be a wandering scholar. Successive generations of sedentary scholars were less apt to be engrossed by a single text and expend their energies in elaborating on it. The era of the glossator and commentator came to an end, and a new “era of intense cross referencing between one book and another” began.

Merely by making more scrambled data available, by increasing the output of Aristotelian, Alexandrian, and Arabic texts, printers encouraged efforts to unscramble these data. Some medieval coastal maps had long been more accurate than many ancient ones, but few eyes had seen either.

Contradictions became more visible, divergent traditions more difficult to reconcile.

Printing encouraged forms of combinatory activity which were social as well as intellectual. It changed relationships between men of learning as well as between systems of ideas.

The new wide-angled, unfocused scholarship went together with a new single-minded, narrowly focused piety. At the same time, practical guidebooks and manuals also became more abundant, making it easier to lay plans for getting ahead in this world – possibly diverting attention from uncertain futures in the next one.

“Considering some effects produced by standardization”

The very act of publishing errata demonstrated a new capacity to locate textual errors with precision and to transmit this information simultaneously to scattered readers.

Sixteenth-century publications not only spread identical fashions but also encouraged the collection of diverse ones.

Concepts pertaining to uniformity and to diversity – to the typical and to the unique – are interdependent. They represent two sides of the same coin. In this regard one might consider the emergence of a new sense of individualism as a by-product of the new forms of standardization.

It’s interesting to think of printing and the uniformity it spread as the birth of individualism. Does the internet, by connecting highly dispersed but like minded people into tight niches, bring about a reduction of individualism? Bubbles of conformity within your tribe?

no precedent existed for addressing a large crowd of people who were not gathered together in one place but were scattered in separate dwellings and who, as solitary individuals with divergent interests, were more receptive to intimate interchanges than to broad-gauged rhetorical effects.

There is simply no equivalent in scribal culture for the “avalanche” of “how-to” books which poured off the new presses

“Reorganizing texts and reference guides: rationalizing, codifying, and cataloguing data”

printers with regard to layout and presentation probably helped to reorganize the thinking of readers.

Basic changes in book format might well lead to changes in thought patterns… For example, printed reference works encouraged a repeated recourse to alphabetical order.

Alphabetical ordering. The simplest of sorting algorithms. Today it seems its place has been taken by reverse chronological sorting. Only the most recent thing is important.

“From the corrupted copy to the improved edition”

A sequence of printed herbals beginning in the 1480s and going to 1526 reveals a “steady increase in the amount of distortion,” with the final product – an English herbal of 1526 – providing a “remarkably sad example of what happens to visual information as it passed from copyist to copyist.” … data tended to get garbled at an ever more rapid pace. But under the guidance of technically proficient masters, the new technology also provided a way of transcending the limits which scribal procedures had imposed upon technically proficient masters in the past.

fresh observations could at long last be duplicated without being blurred or blotted out over the course of time.

“Considering the preservative powers of print: fixity and cumulative change”

Of all the new features introduced by the duplicative powers of print, preservation is possibly the most important.

as edicts became more visible, they also became more irrevocable. Magna Carta, for example, was ostensibly “published”

Copying, memorizing, and transmitting absorbed fewer energies.

“Amplification and reinforcement: the persistence of stereotypes and of sociolinguistic divisions”

Both “stereotype” and “cliché” are terms deriving from typographical processes developed three and a half centuries after Gutenberg.

an unwitting collaboration between countless authors of new books and articles. For five hundred years, authors have jointly transmitted certain old messages with augmented frequency even while separately reporting on new events or spinning out new ideas.

 

Eisenstein, Elizabeth L. (2012-03-29). The Printing Revolution in Early Modern Europe (Canto Classics). Cambridge University Press. Kindle Edition.

Bash Your Day

For the past six months or so I’ve used a bash script to start my morning. The genesis of this idea was that I wanted to be more effective at prioritizing reviewing other people’s code and writing my own code. Assuming the number of hours one puts into work is fixed (because that is the only way for life to be sustainable), that means that I need to cut back on other things. The obvious things to minimize are time spent in Slack, on P2s (Automattic’s internal blogs), and time spent being distracted in the middle of the day because I can’t focus for whatever reason.

I decided to adapt the idea of reducing the number of decisions I have to make, inspired a bit by Obama always wearing the same suit to reduce cognitive load. Rather than randomly looking at Slack/P2s/email in the morning and trying to navigate how to go from one to the next, all I need to do is start a script, and it walks me methodically through everything I think I need to look at in the morning before I can really get to work coding. It eliminates having to think about what to look at and gives me some gentle reminders to prevent me from getting stuck doing things that aren’t really that important.

Screen Shot 2016-12-12 at 9.34.27 AM.png

The Script Basics

There are three fundamental functions in my script:
1. i_do_say(): Sends a message to the OS X notifications system and to the say command (advanced usage: speed up the rate and change the voice)
2. timed_msg(): sends a string to i_do_say(), then it waits the specified amount of time. The timer can be interrupted by hitting any key.
3. timed_confirm(): repeatedly calls i_do_say() every X seconds, prompting me to finish up and move on. There is an initial message, and then a second message that repeats indefinitely. I can type ‘d’ to add an extra 5 minutes of delay to the time.
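Here is a minimal sketch of how these three functions might look. It is an illustration written from the descriptions above, not the code in the repo. Several things are assumptions: the notification title, the SAY_RATE variable, the choice that any key other than ‘d’ confirms, the choice that a keypress makes timed_msg() return non-zero, and bash 4 or newer (where a timed out read returns a status above 128). The `say` and `osascript` calls are skipped when those commands are not installed.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the three helpers; the real versions may differ.

# Print a message, and on macOS also show a notification and speak it.
# Note: a message containing double quotes would need escaping for osascript.
i_do_say() {
  local msg="$1"
  printf '%s\n' "$msg"
  if command -v osascript >/dev/null 2>&1; then
    osascript -e "display notification \"$msg\" with title \"Bash My Day\"" >/dev/null 2>&1 || true
  fi
  if command -v say >/dev/null 2>&1; then
    # SAY_RATE (words per minute) is an assumed knob for speeding up speech.
    say -r "${SAY_RATE:-220}" "$msg" >/dev/null 2>&1 || true
  fi
  return 0
}

# Say a message, then wait up to N minutes.
# A keypress ends the wait and returns non-zero, which stops an && chain.
timed_msg() {
  local msg="$1" minutes="$2" key
  i_do_say "$msg"
  if read -r -s -n 1 -t "$(( minutes * 60 ))" key; then
    return 1   # key pressed: skip the rest of the group
  fi
  return 0     # read failed (timer expired or stdin closed): carry on
}

# Say a first message, then repeat a second one every INTERVAL seconds
# until a key is pressed. Typing 'd' pushes the next reminder out 5 minutes.
timed_confirm() {
  local first="$1" repeat="$2" interval="$3" key status delay
  i_do_say "$first"
  delay="$interval"
  while true; do
    status=0
    read -r -s -n 1 -t "$delay" key || status=$?
    if [ "$status" -eq 0 ]; then
      if [ "$key" = "d" ]; then
        delay=$(( interval + 300 ))   # extra five minutes before nagging again
        continue
      fi
      return 0                        # any other key confirms
    fi
    if [ "$status" -le 128 ]; then
      return 1                        # stdin closed, nothing left to wait for
    fi
    i_do_say "$repeat"                # timed out: nag again
    delay="$interval"
  done
}
```

Returning non-zero on a keypress is what lets a group of messages joined with && be abandoned as soon as I am done with that step.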

The timed_msg() and timed_confirm() commands can be chained with && so if you cancel out of one then the following commands will be skipped. This makes it easy to have multiple messages in a group, but easily jump to the next thing when you are done. Here’s a good example:

timed_msg "Start five minutes of slack and IRC" 3 &&
timed_msg "Finish slack and IRC in two minutes" 2 &&
timed_confirm "Ready for alerts?" "Does today look like a catch up morning?" 180

I mix these commands along with some others like:

  • Opening my email: open https://inbox.google.com/u/0/
  • Closing slack: osascript -e 'quit app "Slack"'
  • Opening emacs: osascript -e 'activate application "emacs"'
  • Starting music: osascript -e 'tell application "iTunes"' -e 'set new_playlist to "Coding" as string' -e "play playlist new_playlist" -e "end tell"

The full repo of my scripts is at https://github.com/gibrown/bash-my-day

I have a reasonably complicated script for my morning with a couple options depending on what I am trying to work on. Some days you just know there is a lot of communication to catch up on, so the idea of quitting slack really won’t work, but my default is to get through Slack/P2s/Alerts/Planning/Meditation in about 30-40 minutes and then close Slack and start on reviewing other people’s code. Optimize for unblocking other people and getting myself to work on writing code (which typically is also about enabling someone else). The morning is also when I am the freshest, so it’s a good time to be coding. Once I get more tired in the afternoon I can get back to closing all those tabs I opened in the morning.

I also have a few other scripts:

  • Ending my day to set myself up for being successful tomorrow (close tabs!)
  • Lunch time: I usually eat at my desk, so it is a good time to empty email and close tabs
  • A script to get me to close out slack again and get back to coding

Most days I find I only use the morning script. I still need to experiment more with whether the other scripts are helpful.

My Morning

You can see all of this in my morning script, but I thought I would run through my thinking:

  • 5 min: Slack and IRC: if something needs a lot of attention this is usually how I will find out. My main goal here though is to find things that need to be prioritized
  • 2 min: Look at alerts in my email: again, check for any fires
  • 5 min: open p2s, see if there are any I really need to respond to. This is my biggest weakness, it’s hard to put off responding until later in the afternoon.
  • 5 min: looking at my to do list and choosing what to work on (all in org-mode in emacs)
  • 2-10 min: Meditation – recently added. It’s ramping up over the next few weeks from 2 minutes, to start building the habit, up to 10 minutes a day, where I’d like to be. The ramp up is all scripted too: see timed_weekly_ramp(). I’ve tried fitting meditation into my day a few different ways before with not great success. I thought this may be a good way to build the habit.
  • 5 min: Journal – I do a better job working on the important things if I do some amount of reflection. Recently this has been turning into a blog post which I’d like to make more often.
  • 2 min: write a standup report (2-5 lines) for my team about what I did yesterday and what I’m doing today
  • Music starts. I start reviewing code. Script repeatedly prompts me to close Slack. Eventually I do.
  • Move on to writing my own code. Prompt is just to write for 10 minutes. Once I get started I don’t tend to stop. Getting started in the face of distractions is usually the biggest problem.
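The weekly ramp for meditation can be sketched as a small calculation. This is not the timed_weekly_ramp() from the repo, whose signature I am not reproducing here. The helper name ramp_minutes, its arguments, and the one minute per week step are all hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: how many minutes to spend this week, ramping up
# from MIN to MAX by STEP minutes for each full week since START_EPOCH.
# Usage: ramp_minutes START_EPOCH MIN MAX STEP [NOW_EPOCH]
ramp_minutes() {
  local start="$1" min="$2" max="$3" step="$4"
  local now="${5:-$(date +%s)}"          # default to the current time
  local week_seconds=604800              # 7 * 24 * 60 * 60
  local weeks=$(( (now - start) / week_seconds ))
  if [ "$weeks" -lt 0 ]; then
    weeks=0                              # start date is in the future
  fi
  local minutes=$(( min + weeks * step ))
  if [ "$minutes" -gt "$max" ]; then
    minutes="$max"                       # cap at the target length
  fi
  echo "$minutes"
}
```

The result could then be passed as the duration of a timed message, so the script itself decides how long the session is each week.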

Last week I also added a few options such as explicitly having catch up mornings when I have been away from work for a few days. Rather than pushing me into coding, those focus on getting me to clean out my inbox, and respond to p2s. Some days it is better to just admit to myself that I need to focus on communication rather than coding.

Results

So what impact did this have? Well, thanks to using Rescuetime to track my time over the past five years I have some pretty good data on that. I have my time broken into 5 categories: Most Productive (coding and code review), Productive (Slack, P2s, email), Neutral (mostly random websites), Distracting (Twitter), and Most Distracting (HackerNews, Talking Points Memo, other sites I refresh too often).

By comparing the past 6 months to the 6 months before that I can more or less see how much having these scripts impacted the percentage of time I spend in each of these categories:

  • Most Productive: up 27.7% (more coding and code review!)
  • Productive: down 14% (yay, less Slack!) Went from being 48% of my time down to 41%.
  • Neutral: down 22.6% (a lot fewer random sites!)
  • Distracting: down 13%
  • Most Distracting: down 9%
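A note on reading these numbers: they are relative changes in each category’s share of my time, not percentage point differences. As a quick sketch using the rounded Productive shares quoted above (the function name is just for illustration):

```shell
#!/usr/bin/env bash
# Relative change between two shares of time, as a percentage.
# Usage: relative_change BEFORE AFTER
relative_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

# Productive went from 48% of my time to 41%: a drop of 7 percentage
# points, which is a relative change of about -14.6%.
relative_change 48 41
```

The small gap between that result and the 14% in the list presumably comes from the shares being rounded.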

These metrics aren’t perfect, since what I work on varies over time. When I’m working on lots of hiring there is a lot more time spent on communication, for instance. It’s a little hard to correct for variations, but I’ve tried to look at the data for some shorter periods also and I get similar results.

I’m pretty convinced that using a script to push me through the morning both makes me feel more productive and has a big impact on how I spend my time.