LeetCode: Words From Characters, and Benchmarking Solutions
Article Table of Contents
I recently worked through a LeetCode problem.
The first run was pretty brutal. It took (what felt like) forever, and I was not content with my solution.
Even better, it passed the test cases given while building the solution, but failed on submission.
So, once I fixed it so it didn’t fail on submission… it then timed out. My code was so inefficient leetcode’s test suite took too long to finish.
I have a few learning objectives for doing this kind of code challenge:
- Improve my knowledge of Ruby’s API
- Get better at benchmarking code “snippets”.
- Refresh/document my “process” for learning new things about a programming language, for the benefit of Turing students who are attacking their own code challenges.
The challenge and my (not quite working) solution. #
Here’s the given instructions:
You are given an array of strings
words
and a stringchars
.A string is good if it can be formed by characters from
chars
(each character can only be used once).Return the sum of lengths of all good strings in words.
Example 1:
Input: words = ["cat","bt","hat","tree"], chars = "atach" Output: 6 Explanation: The strings that can be formed are "cat" and "hat" so the answer is 3 + 3 = 6.
Example 2:
Input: words = ["hello","world","leetcode"], chars = "welldonehoneyr" Output: 10 Explanation: The strings that can be formed are "hello" and "world" so the answer is 5 + 5 = 10.
Note:
1 <= words.length <= 1000 1 <= words[i].length, chars.length <= 100 All strings contain lowercase English letters only.
There’s a few things I struggled with; first, I was doing all of this in the LeetCode UI; I didn’t get to use pry
to pause code execution and quickly poke around internal state; I had to rely on liberal use of print
statements, which is decidedly un-fun.
This lovely 13 lines of code took me probably a full hour. Yikes:
def count_characters(words, chars)
found_words = []
words.each do |w|
found_words << w if word_chars_in_approve_list(w, chars)
end
return found_words.join.length
end
def word_chars_in_approve_list(word, chars)
word.split("").all? do |c|
chars.include?(c)
end
end
I hit the “submit” button… it didn’t pass. In the word_chars_in_approve_list
method, I’m only checking to see that all of the characters in word
also exist in the chars
string.
So if
word = "bbb"
chars = "b"
the method returned true, even though it shouldn’t have - per the instructions:
each character can only be used once
So for the word
of bbb
, it would be valid only if the chars
list had at least 3
instances of b
in it:
word = "bbb"
chars = "bbb"
So, I worked a bit with Kunal Madhav from CodeMentor (I had a variety of questions for him, among them was this challenge and my general strategy for solving technical challenges. He was great. You should talk with him.)
Together, we eventually ironed out the following:
def count_characters(words, chars)
found_words = []
words.each do |w|
if word_chars_in_approve_list?(w, chars)
found_words << w
end
end
return found_words.join.length
end
def word_chars_in_approve_list?(word, chars)
word.split("").each do |c|
if chars.include?(c)
chars = remove_first_instance_of(c, chars)
else
return false
end
end
return true
end
def remove_first_instance_of(c, chars)
split_chars = chars.split("")
index = split_chars.index(c)
split_chars.delete_at(index)
return split_chars.join
end
Lots of other solutions available here
But this code was timing out; this took too long to run. So, I figured out a modest performance improvement. Rather than splitting and joining the word so many times, I did it just once:
def count_characters(words, chars)
found_words = []
words.each do |w|
if word_chars_in_approve_list?(w, chars)
found_words << w
end
end
return found_words.join.length
end
def word_chars_in_approve_list?(word, chars)
list = chars.split("")
word.split("").each do |c|
if list.include?(c)
index = list.index(c)
list.delete_at(index)
else
return false
end
end
return true
end
All the action is happening in the word_chars_in_approve_list?
method.
The output from submitting it on LeetCode:
Runtime: 832 ms, faster than 18.18% of Ruby online submissions for
Find Words That Can Be Formed by Characters
. Memory Usage: 10.2 MB, less than 100.00% of Ruby online submissions forFind Words That Can Be Formed by Characters
.
So, the above code worked, but I didn’t really feel like it was good code.
The next day, I wanted to revisit this. I recommend to Turing students all the time to write code down in a notebook, and re-build the solution/class/whatever from their notebook.
So, I’m taking a dose of my own medicine:
I made some improvements from a readability perspective:
def count_characters(words, chars)
words.reduce(0) do |results, word|
results += word.length if word_in_chars?(word, chars)
results
end
end
def word_in_chars?(word, chars)
char_list = chars.split("")
word.split("").each do |c|
if char_list.include?(c)
char_list.delete_at(char_list.index(c))
next
end
return false
end
true
end
And the above solution vs. the below solution is what I want to benchmark:
def count_characters(words, chars)
found_words = []
words.each do |w|
if word_chars_in_approve_list?(w, chars)
found_words << w
end
end
return found_words.join.length
end
def word_chars_in_approve_list?(word, chars)
word.split("").each do |c|
if chars.include?(c)
chars = remove_first_instance_of(c, chars)
else
return false
end
end
return true
end
def remove_first_instance_of(c, chars)
split_chars = chars.split("")
index = split_chars.index(c)
split_chars.delete_at(index)
return split_chars.join
end
Benchmarking Code Snippets #
LeetCode offers pretty detailed benchmarking, but I wanted to see exactly how bad my “too slow” solution was.
So, I spun up a little benchmark.
In ruby, you can do something like this:
require 'benchmark'
def count_characters(words, characters)
# first version of code you want to test
end
def count_characters_medium(words, characters)
# second version of code to test
end
def count_characters_fast(words, characters)
# third version of code to test
end
Benchmark.bmbm do |x|
iterations = 30000
words = ["cat", "hat", "ball", "sasafras"]
characters = "catbl"
x.report("slower") {
iterations.times do |foo|
count_characters(words, characters)
end
}
x.report("medium") {
iterations.times do |foo|
count_characters_medium(words, characters)
end
}
x.report("faster") {
iterations.times do |foo|
count_characters_fast(words, characters)
end
}
end
Include the code you want to test before declaring Benchmark
, and this benchmark code will run the given arguments through the associated methods.
When I used small strings, it was not easy to tell which code was faster. I was running it 30000+ times, and was getting output showing only a small difference.
But then I ran it on the input LeetCode was providing, and I got much more interesting results:
Rehearsal -------------------------------------------
slower 13.530000 0.080000 13.610000 ( 14.094186)
medium 2.120000 0.020000 2.140000 ( 2.453419)
faster 1.700000 0.010000 1.710000 ( 1.725038)
--------------------------------- total: 17.460000sec
user system total real
slower 13.160000 0.050000 13.210000 ( 13.371204)
medium 1.740000 0.010000 1.750000 ( 1.759786)
faster 1.810000 0.010000 1.820000 ( 1.888091)
My “slow” code was taking a bit more than seven times as long to run as the “faster” code. That’s a noteworthy improvement.