
LeetCode: Words From Characters, and Benchmarking Solutions

I recently worked through a LeetCode problem.

The first run was pretty brutal. It took (what felt like) forever, and I was not content with my solution.

Even better, it passed the provided test cases while I was building the solution, but failed on submission.

So, once I fixed it so it didn’t fail on submission… it then timed out. My code was so inefficient that LeetCode’s test suite took too long to finish.

I have a few learning objectives for doing this kind of code challenge:

  • Improve my knowledge of Ruby’s API
  • Get better at benchmarking code “snippets”.
  • Refresh/document my “process” for learning new things about a programming language, for the benefit of Turing students who are attacking their own code challenges.

The challenge and my (not quite working) solution.

Here are the given instructions:

You are given an array of strings words and a string chars.

A string is good if it can be formed by characters from chars (each character can only be used once).

Return the sum of lengths of all good strings in words.

Example 1:

Input: words = ["cat","bt","hat","tree"], chars = "atach"
Output: 6
Explanation: 
The strings that can be formed are "cat" and "hat" so the answer is 3 + 3 = 6.

Example 2:

Input: words = ["hello","world","leetcode"], chars = "welldonehoneyr"
Output: 10
Explanation: 
The strings that can be formed are "hello" and "world" so the answer is 5 + 5 = 10.

Note:

1 <= words.length <= 1000
1 <= words[i].length, chars.length <= 100
All strings contain lowercase English letters only.

There are a few things I struggled with. First, I was doing all of this in the LeetCode UI, so I couldn’t use pry to pause code execution and quickly poke around internal state; I had to rely on liberal use of print statements, which is decidedly un-fun.
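In hindsight, one workaround is to paste the examples into a small local script. Here’s a minimal sketch, assuming the solution is defined as a top-level count_characters(words, chars) method, as in the snippets below:

# quick_check.rb (hypothetical) - paste your count_characters definition above
# these lines, then run `ruby quick_check.rb`
raise "example 1 failed" unless count_characters(["cat", "bt", "hat", "tree"], "atach") == 6
raise "example 2 failed" unless count_characters(["hello", "world", "leetcode"], "welldonehoneyr") == 10
puts "both examples pass"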

These lovely 13 lines of code took me probably a full hour. Yikes:

def count_characters(words, chars)
  found_words = []
  words.each do |w|
    found_words << w if word_chars_in_approve_list(w, chars)
  end  
  return found_words.join.length
end

def word_chars_in_approve_list(word, chars)
  word.split("").all? do |c|
    chars.include?(c)
  end
end

I hit the “submit” button… it didn’t pass. In the word_chars_in_approve_list method, I was only checking that every character in word also exists in the chars string, without accounting for how many times each character could be used.

So if

word = "bbb"
chars = "b"

the method returned true, even though it shouldn’t have - per the instructions:

each character can only be used once

So for the word of bbb, it would be valid only if the chars list had at least 3 instances of b in it:

word = "bbb"
chars = "bbb"

So, I worked a bit with Kunal Madhav from CodeMentor. (I had a variety of questions for him; among them were this challenge and my general strategy for solving technical challenges. He was great. You should talk with him.)

Together, we eventually ironed out the following:

def count_characters(words, chars)
  found_words = []
  words.each do |w|
    if word_chars_in_approve_list?(w, chars)
      found_words << w
    end
  end
  return found_words.join.length
end

def word_chars_in_approve_list?(word, chars)
  word.split("").each do |c|
    if chars.include?(c)
      chars = remove_first_instance_of(c, chars)
    else
      return false
    end
  end
  return true
end

def remove_first_instance_of(c, chars)
  split_chars = chars.split("")
  index = split_chars.index(c)
  split_chars.delete_at(index)
  return split_chars.join
end

Lots of other solutions available here

But this code was timing out; LeetCode’s full test suite took too long to run. So, I figured out a modest performance improvement. Rather than splitting and re-joining chars for every matched character, I split it just once per word:

def count_characters(words, chars)
  found_words = []
  words.each do |w|
    if word_chars_in_approve_list?(w, chars)
      found_words << w
    end
  end
  return found_words.join.length
end

def word_chars_in_approve_list?(word, chars)
  list = chars.split("")
  word.split("").each do |c|
    if list.include?(c)
      index = list.index(c)
      list.delete_at(index)
    else
      return false
    end
  end
  return true
end

All the action is happening in the word_chars_in_approve_list? method.
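The difference is mostly allocation: remove_first_instance_of built a brand-new array and string for every matched character, while this version splits chars once per word and then mutates that single array. Roughly (illustrative only):

chars = "welldonehoneyr"

# Old approach: for each matched character, split, delete, and re-join,
# allocating a fresh array and a fresh string every time
chars = chars.split("").tap { |a| a.delete_at(a.index("e")) }.join

# New approach: split once, then delete from the same array in place
list = "welldonehoneyr".split("")
list.delete_at(list.index("e"))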

The output from submitting it on LeetCode:

Runtime: 832 ms, faster than 18.18% of Ruby online submissions for Find Words That Can Be Formed by Characters. Memory Usage: 10.2 MB, less than 100.00% of Ruby online submissions for Find Words That Can Be Formed by Characters.

So, the above code worked, but I didn’t really feel like it was good code.


The next day, I wanted to revisit this. I’m always recommending that Turing students write code down in a notebook, and then re-build the solution/class/whatever from their notebook.

So, I’m taking a dose of my own medicine:

I made some improvements from a readability perspective:

def count_characters(words, chars)
  words.reduce(0) do |results, word|
    results += word.length if word_in_chars?(word, chars)
    results
  end
end

def word_in_chars?(word, chars)
  char_list = chars.split("")
  word.split("").each do |c|
    if char_list.include?(c)
      char_list.delete_at(char_list.index(c))
      next
    end
    return false
  end
  true
end
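Tracing the reduce with Example 1 from the problem statement shows how the accumulator builds up the answer:

count_characters(["cat", "bt", "hat", "tree"], "atach")
# "cat"  -> fits in "atach",     results: 0 + 3 = 3
# "bt"   -> "b" isn't available, results stays 3
# "hat"  -> fits in "atach",     results: 3 + 3 = 6
# "tree" -> "r" isn't available, results stays 6
# => 6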

And this reduce-based version vs. the original solution (repeated below) is what I want to benchmark:

def count_characters(words, chars)
  found_words = []
  words.each do |w|
    if word_chars_in_approve_list?(w, chars)
      found_words << w
    end
  end
  return found_words.join.length
end

def word_chars_in_approve_list?(word, chars)
  word.split("").each do |c|
    if chars.include?(c)
      chars = remove_first_instance_of(c, chars)
    else
      return false
    end
  end
  return true
end

def remove_first_instance_of(c, chars)
  split_chars = chars.split("")
  index = split_chars.index(c)
  split_chars.delete_at(index)
  return split_chars.join
end

Benchmarking Code Snippets

LeetCode offers pretty detailed benchmarking, but I wanted to see exactly how bad my “too slow” solution was.

So, I spun up a little benchmark.

In Ruby, you can do something like this:

require 'benchmark'

def count_characters(words, characters)
  # first version of code you want to test
end

def count_characters_medium(words, characters)
  # second version of code to test
end

def count_characters_fast(words, characters)
  # third version of code to test
end

Benchmark.bmbm do |x|
  iterations = 30000
  words = ["cat", "hat", "ball", "sasafras"]
  characters = "catbl"

  x.report("slower") {
    iterations.times do
      count_characters(words, characters)
    end
  }

  x.report("medium") {
    iterations.times do
      count_characters_medium(words, characters)
    end
  }

  x.report("faster") {
    iterations.times do
      count_characters_fast(words, characters)
    end
  }
end

Define the methods you want to test above the Benchmark.bmbm block, and the benchmark will run the given arguments through each of them. (Benchmark.bmbm does a rehearsal pass first and then the real measurement, which is why the output below has two sections.)

When I used small strings, it was hard to tell which code was faster: even running each version 30,000+ times, the output showed only a small difference.

But then I ran it on the input LeetCode was providing, and I got much more interesting results:

Rehearsal -------------------------------------------
slower   13.530000   0.080000  13.610000 ( 14.094186)
medium    2.120000   0.020000   2.140000 (  2.453419)
faster    1.700000   0.010000   1.710000 (  1.725038)
--------------------------------- total: 17.460000sec

              user     system      total        real
slower   13.160000   0.050000  13.210000 ( 13.371204)
medium    1.740000   0.010000   1.750000 (  1.759786)
faster    1.810000   0.010000   1.820000 (  1.888091)

My “slow” code was taking a bit more than seven times as long to run as the “faster” code. That’s a noteworthy improvement.
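If you want to reproduce a similar gap yourself without LeetCode’s exact test data, a hypothetical input within the problem’s stated constraints (up to 1,000 words of up to 100 lowercase characters each) is enough:

# Hypothetical stand-in for LeetCode's larger input; not the actual test data
alphabet = ("a".."z").to_a
words = Array.new(1000) { Array.new(rand(1..100)) { alphabet.sample }.join }
characters = Array.new(100) { alphabet.sample }.join

Drop those into the Benchmark.bmbm block above in place of the small example, and the gap between the versions should show up much more clearly.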
